


THE JOURNAL OF 
EDUCATIONAL PSYCHOLOGY 








Volume XXXVII April, 1946 Number 4 








A SYSTEMATIC ERROR IN 
KUHLMANN-ANDERSON MENTAL AGES* 


STAN E. WIMBERLY 


University of Michigan 


The present investigation of the Kuhlmann-Anderson Tests 
developed quite accidentally from another research project in 
which these tests were selected as measures of intelligence for 
a sample of school children. The analysis of these data as well 
as additional data secured from clinical material has led to the 
determination of a sufficiently serious fallacy in the standardiza- 
tion of these tests as to render questionable their uncritical 
use in indicating the intelligence of individuals. 


SCHOOL DATA 


The nature of the original research project was such as to 
require the administration of a battery of tests to a relatively 
homogeneous age group within which representative variations in 
ability would be preserved. Accordingly, a town in Michigan was 
selected which possessed a single school plant containing the first 
twelve grades, and arrangements were made to administer the test 
battery to all children in the school who were within the age 
ranges of ten, eleven, or twelve years on the first day of testing. 
The battery required two half-days to administer; and the three 
age groups were tested separately, each being given one morning 
and one afternoon testing session on successive days or with one 
day intervening. The testing was done by two experienced 





* This study was performed by the author as a part of the research pro- 
gram of the Bureau of Psychological Services. The Bureau is a unit of the 
Institute for Human Adjustment of the Horace H. Rackham School of 
Graduate Studies. Acknowledgement is made to Professors C. H. Griffitts 
and Carl R. Brown of the Department of Psychology of the University of 
Michigan for their critical evaluation of this research. 























clinicians and was completed within the week in May, 1945, 
immediately preceding the period of final examinations at the 
close of the school year. A total of seventy-seven children was 
given the entire battery. These included all the children in the 
school who were in the age groups selected with the exception of 
three children who, because of absences, did not complete the 
battery. 

The contemplated investigation required as precise an evalua- 
tion of intelligence as was possible in a group situation. Practical 
considerations precluded the use of individual tests. The 
Kuhlmann-Anderson Tests were selected as the best available. 
The reader will recall that these tests are thirty-nine in number, 
arranged in order of difficulty to form a scale for the measurement 
of mental development from age five to mental maturity. In its 
ordinary use this scale is divided into overlapping batteries of 
ten successive tests, each battery being assigned to the grade or 
grades for which its difficulty is most appropriate. Each test in 
the scale is individually standardized, and mental age equivalent 
scores are provided for evaluating performance on it. A deter- 
mination of the child’s IQ on the whole battery is achieved by 
taking the median of the ten mental ages yielded by the battery 
and dividing by his chronological age. In an article relating to 
this procedure, Kuhlmann discusses the adaptation of the scale as
an individual test in a way which makes clear its general flexibility.

“Any section in the series of the thirty-five tests out of which
the eight group test batteries are built up [an earlier edition of the
scale is being discussed] may be used in examining an individual,
and as many or as few tests may be used as time or judgment
dictates. The median of the mental ages earned on the tests that
are used is his mental age.”

In the Kuhlmann-Anderson Manual, in connection with the
problem of correcting for the effect of tests of improper difficulty
by administering additional tests in the scale, a statement is
found to the effect that “the median mental age, where additional
tests are used, would be the median of fourteen or sixteen or any
even number of tests used.”
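The scoring rule described above may be illustrated with a minimal sketch. The function name, the sample mental ages, and the chronological age are hypothetical, and the multiplication by 100 follows the usual ratio-IQ convention rather than any wording in the Manual.

```python
from statistics import median


def kuhlmann_anderson_iq(mental_ages_months, chronological_age_months):
    """Ratio IQ: median of the per-test mental ages over chronological age, times 100."""
    median_mental_age = median(mental_ages_months)
    return 100.0 * median_mental_age / chronological_age_months


# Hypothetical child: ten mental ages (in months) from one battery,
# chronological age 132 months. These figures are not from the study.
example_mental_ages = [120, 126, 129, 131, 133, 135, 138, 140, 144, 150]
example_iq = kuhlmann_anderson_iq(example_mental_ages, 132)
print(round(example_iq, 1))
```

Because the median is taken over whatever tests were given, the same function serves for ten, fourteen, or twenty tests without change.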

This flexibility in the number of tests used appeared to permit 
a variation in the application of the scale to the study under 
consideration which was regarded as increasing the probable 
precision of measurement. This variation was simply to increase 










the number of tests administered above the usual ten in order to 
increase the reliability of the mental age measurement by taking 
the median value of a larger number of observations of mental 
age. An examination of the age norms given in the Manual for 
the several tests appeared to indicate that the extreme range of 
applicability of the scale to the ages concerned would be exhausted 
with about twenty of the tests. Accordingly, it was decided to 
administer tests 12 through 31 to the ten- and eleven-year-olds, 
and tests 15 through 34 to the twelve-year-olds. This procedure 
was followed. 

Such a departure from ordinary procedure in the use of the 
tests suggested that the results be evaluated as far as possible 
before being used. Moreover, in the writings of the authors of 
these tests one finds many comments emphasizing the importance 
of using tests of the proper difficulty for the mental maturity level 
of the child. In the Manual the statement is found: “If the
tests are too easy, many will not do their best, and the mental
ages they earn will be too low.” Kuhlmann, commenting generally
on intelligence tests, states: “. . . the degree of difficulty
that a test presents to the child tested determines the degree of
effort with which he will work on it . . .” Anderson, in discussing
the fifth revision of the scale, may be quoted as follows:

“The practice recommended by the authors of fitting the test 
booklet to be used to the expected level of the group to be tested 
has proved satisfactory for groups in general. This practice, 
however, does not insure the accuracy of the ratings of the few 
pupils definitely inferior or superior to the median of the group. 
These children, as has been previously recommended, should be 
selected on the basis of the numbers of their zero and maximum 
scores. For the greatest accuracy, those receiving two or more 
zero scores should be given the tests necessary to be rated on the 
next lower booklet. Correspondingly those receiving two or more
maximum scores should be given the supplementary tests necessary
to be rated on the next higher booklet.”

In consideration, therefore, of possible inaccuracies relating to 
the use of twenty tests rather than ten, four separate determina- 
tions of the intelligence quotient for each of the seventy-seven 
children were made as follows: 

a) Using a median mental age based on twenty tests with all 
zero and maximum scores included. 
















b) Using a median mental age based on the ten tests recom- 
mended for the grade placement of the child with zero and maxi- 
mum scores included. 

c) Using a median mental age based on twenty tests with all 
zero and maximum scores excluded. 

d) Using a median mental age based on the ten tests recom- 
mended for the grade placement of the child with zero and maxi- 
mum scores excluded. 
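The four determinations listed above differ only in which tests enter the median and in whether zero and maximum scores are retained. A minimal sketch follows; the data layout, the flag marking extreme scores, and every number are hypothetical, and only the use of tests 19 through 28 as the fifth-grade battery is taken from the text.

```python
from statistics import median


def iq_from_tests(results, ca_months, tests, include_extremes):
    """Ratio IQ from the median mental age over the chosen tests.

    results maps test number -> (mental_age_months, is_zero_or_max).
    """
    ages = [
        mental_age
        for test, (mental_age, is_extreme) in results.items()
        if test in tests and (include_extremes or not is_extreme)
    ]
    return 100.0 * median(ages) / ca_months


# Hypothetical child given tests 12 through 31; tests 12, 13, and 31 are
# marked as zero or maximum scores. None of these figures come from the study.
results = {t: (100 + 2 * t, t in (12, 13, 31)) for t in range(12, 32)}
all_twenty = range(12, 32)
grade_ten = range(19, 29)  # tests 19 through 28, the fifth-grade battery
ca_months = 132

determinations = {
    "a": iq_from_tests(results, ca_months, all_twenty, True),
    "b": iq_from_tests(results, ca_months, grade_ten, True),
    "c": iq_from_tests(results, ca_months, all_twenty, False),
    "d": iq_from_tests(results, ca_months, grade_ten, False),
}
print({key: round(value, 2) for key, value in determinations.items()})
```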

Distributions of these various IQ’s were tabulated and sum- 
mary statistics were computed. The product-moment coefficient 
of correlation describing the relationship between IQ’s computed 
on the twenty-test and ten-test basis was calculated both in the 
instance when zero and maximum scores were ignored and in the 
instance when they were allowed to influence the computation of 
mental age. The results of these procedures are summarized in 
Table I. 


TABLE I.—STATISTICAL COMPARISONS OF FOUR DETERMINATIONS
OF IQ’S BASED ON KUHLMANN-ANDERSON TEST SCORES

Identity of IQ Determination               M        σ       σ_M     r*

a) Twenty Tests Including Zero
   and Maximum Scores                     96.75    10.95    1.26
                                                                    r_ab = +.92
b) Ten Tests Including Zero
   and Maximum Scores                     99.62    13.76    1.58

c) Twenty Tests Excluding Zero
   and Maximum Scores                     97.14    10.27    1.18
                                                                    r_cd = +.92
d) Ten Tests Excluding Zero
   and Maximum Scores                     99.14    12.48    1.43

* These values may be regarded as spuriously high since they represent
correlations between arrays of quotients which have chronological ages in
common. This will be true of other correlation coefficients reported later
in this study. The reader is reminded that the range of chronological ages
involved here is three years. See Guilford, p. 252. It will also be noted
that in the case of r_ab and r_cd, the nature of the problem involves the
relationship between values determined from ten tests on the one hand and
twenty tests including the ten on the other.


A cursory inspection of these results suggests that the inclusion 
or exclusion of zero and maximum scores makes little difference in 







the group distributions. The mean IQ differences in the two
instances when this comparison is possible approximate one-half
of one IQ point; and although the comparable standard deviation
differences are somewhat greater, the substantial differences in
both these values occur when twenty-test distributions are compared
with the ten-test ones. Moreover, the inclusion or exclusion
of maximum and zero scores has no effect on the magnitude
of the correlation between twenty- and ten-test variables when 
the coefficients describing these correlations are computed to two 
decimal places. In view of these results, and because the Kuhl- 
mann-Anderson Manual makes practically no provision for dis- 
regarding zero and maximum scores, further analyses were based 
upon quotients determined from mental ages which had been 
allowed to be influenced by these extreme scores. 

In evaluating the reliability of the differences obtained between 
the twenty- and ten-test distributions certain critical ratios 
were computed. The difference between the means of these two 
distributions was found to be 4.41 times as large as the standard 
error of this difference, while the difference between the standard 
deviations yielded a critical ratio of 4.73. These values were
accepted as statistically convincing evidence of the generalized 
actuality of the differences involved as distinct from the influence 
of random sampling. Further evidence of the influence of the 
effect under consideration, of course, derives from the fact that 
the product-moment correlation between the two arrays of 
quotients, in spite of its spurious character, is as low as +.92. 
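The critical ratios just reported can be approximated from the entries of Table I. The text does not state the formulas used; the sketch below assumes the classical standard error of a difference between correlated measures, takes the standard error of a standard deviation as σ divided by the square root of 2N, and uses the squared correlation for the standard deviations. These are assumptions, and the variable names are illustrative.

```python
from math import sqrt

N = 77
r_ab = 0.92

# Table I, rows a) and b): twenty-test and ten-test determinations.
mean_20, sd_20, se_mean_20 = 96.75, 10.95, 1.26
mean_10, sd_10, se_mean_10 = 99.62, 13.76, 1.58


def se_difference(se_1, se_2, r):
    """Standard error of a difference between two correlated statistics (assumed formula)."""
    return sqrt(se_1 ** 2 + se_2 ** 2 - 2.0 * r * se_1 * se_2)


# Critical ratio for the difference between the means.
cr_means = (mean_10 - mean_20) / se_difference(se_mean_10, se_mean_20, r_ab)

# Critical ratio for the difference between the standard deviations,
# with se(sd) = sd / sqrt(2N) and r squared in the covariance term (assumed).
se_sd_20 = sd_20 / sqrt(2 * N)
se_sd_10 = sd_10 / sqrt(2 * N)
cr_sds = (sd_10 - sd_20) / se_difference(se_sd_10, se_sd_20, r_ab ** 2)

print(round(cr_means, 2), round(cr_sds, 2))
```

Under these assumptions both ratios fall within a few hundredths of the reported 4.41 and 4.73; the small residual is of the size expected from rounding in the tabled values.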

The investigation at this point turned to the question of the 
etiology of the observed differences. Assuming for the moment 
that the group of children used was representative at the ages 
concerned of the American school population, the results obtained 
from determining mental ages on the basis of the ten tests recom- 
mended for their grade placement appeared to give the better 
measurements. This follows from the facts that the mean IQ in 
this case is very near 100 and that the scatter of these values was 
more nearly what is found in the results from other intelligence 
tests. The lowering of the mean IQ when the greater number 
of tests was used would be consistent with the admonition in the 
Manual that if the tests are too easy the resulting mental ages 
will be too low provided that the added tests lowered the relative 
difficulty of the battery. This possibility was examined by 







investigating the percentages of maximum and zero scores within 
the ten tests recommended for the grade placement of each child 
in comparison with these percentages for the ten tests falling 
outside of this range. These results are given in Table II. 


TABLE II.—PERCENTAGES OF ZERO AND MAXIMUM SCORES ON
KUHLMANN-ANDERSON TESTS RECOMMENDED FOR GRADE
PLACEMENT AS COMPARED WITH TESTS OUTSIDE THE
GRADE PLACEMENT RANGE

Range of Tests                    Per Cent Zero    Per Cent Maximum

Ten Tests Recommended
  for Grade Placement                  1.29              2.46
Ten Tests Falling Outside
  Grade Placement Range                2.98             12.20


It was to be expected that the number of extreme scores would 
increase in the case of the very easy and very hard tests. It was 
also to be expected that children taking a series of these tests 
approximately of correct difficulty for their mental development 
would make more maximum than zero scores. This probability 
is suggested in the following quotation from Kuhlmann: 


“It was aimed to so adjust each test that half its items or trials
would be passed by about two-thirds of the children of the age to
which the test was assigned.”


The absolute values of these percentages, therefore, are of little 
assistance in determining whether the added tests lowered the 
difficulty of the battery. If, however, the relative increase in 
maximum scores for the added tests is greater than the relative 
increase of zero scores, it might be concluded that the twenty-test 
battery was easier for the present group than the ten-test battery. 
The values in Table II indicate that while the relative increase 
in zero scores was by a factor of 2.3, maximum scores increased 
five times in going from the ten tests recommended for grade
placement to the ten tests outside this range. If we accept the 
principle that children will earn lower mental ages on easier tests, 
the lowering of the mean IQ on the twenty-test determination 
would appear to be explained. 
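The two relative increases cited from Table II follow directly from the tabled percentages, as the short check below shows. The variable names are illustrative only.

```python
# Percentages from Table II.
pct_zero_inside, pct_max_inside = 1.29, 2.46      # ten tests recommended for grade
pct_zero_outside, pct_max_outside = 2.98, 12.20   # ten tests outside that range

# Relative increase in going from the recommended tests to the added tests.
zero_factor = pct_zero_outside / pct_zero_inside
max_factor = pct_max_outside / pct_max_inside

print(round(zero_factor, 2), round(max_factor, 2))
```

The first quotient is about 2.3 and the second about 5, so maximum scores rise roughly twice as fast as zero scores when the added tests are included.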

Similarly, the decreased variability of the IQ’s determined 
from the twenty tests as compared with the ten could be inter- 










preted as resulting from the addition of more easy than hard tests. 
Again quoting from Kuhlmann: 


“Speed of performance in many tests reaches a maximum 
because of mechanical limitations, which may be association 
time, hand or eye movement, or other such process involved. 
When a test has become so easy that the majority of children 
in the grade tested approach this mechanical limitation, both dull 
and bright children in the group will make more nearly the same 
score. The second factor lies not directly in the test itself, but in
the reaction of the children to it. There is probably a consider- 
able general tendency for children to relax in effort in proportion 
to the ease of the task. If a child has to try hard to pass any or 
only a few trials in a test he will make the necessary effort to 
do so. If he can pass quite a number of trials without much 
effort he will not work at his maximum. This results again in a 
reduction of the difference in scores between dull and bright.” 


The results so far obtained, accordingly, appeared explainable 
in terms of the principles laid down by the authors of the tests. 
There were, however, several questions which remained unan- 
swered. For example, did the tendency for children to earn lower 
mental ages on Kuhlmann-Anderson Tests which were too easy 
for them have a significant opposite corollary to the effect that 
Kuhlmann-Anderson Tests which were too difficult would yield 
mental ages which were too high? Moreover, if it developed that 
a child’s mental age obtained from a Kuhlmann-Anderson test was 
partially a function of the difficulty of that test, there remained 
the problem of an orderly determination of the magnitude of that 
effect.

In order to answer the first question two more determinations 
of each child’s IQ were made, one from the first ten tests in the 
scale which had been administered to him (the easier half of the 
twenty tests), and the other from the second ten tests (the harder 
half) which he had been given. Since it appears probable 
from the preceding discussion that, on the whole, the entire 
twenty tests represented a lower degree of difficulty for the children
than did the ten tests selected as correct for their grade, one
would expect from the hypothesis being tested that the average 
IQ determined from the easier half of the twenty tests would be 
further below the average IQ determined from the ten tests recom- 










mended for grade placement (99.62) than the average IQ com- 
puted from the harder half of the twenty tests would be above this 
value. However, if the latter value were found to be appreciably 
above 99.62, the inference would be that Kuhlmann-Anderson 
Tests which were too hard did yield mental ages which were too 
high. The product-moment correlation between these two sets of 
IQ’s was also computed. The results are presented in Table III.


TABLE III.—STATISTICAL COMPARISONS OF KUHLMANN-
ANDERSON IQ’S BASED ON TEN EASY AND TEN HARD
TESTS

Identity of IQ Determination               M        σ       σ_M     r

b) Criterion from Table I (Ten Tests
   Recommended for Grade Placement)       99.62    13.76    1.58

e) First Ten in Battery of Twenty
   Tests (Easier Half)                    89.56     8.65     .99
                                                                    r_ef = [illegible]
f) Second Ten in Battery of Twenty
   Tests (Harder Half)                   103.42    14.10    1.62


These results appear quite consistent with the proposition that 
Kuhlmann-Anderson Tests which are too easy yield mental ages 
which are too low, and those which are too hard yield mental ages 
which are too high. The mean IQ difference between values 
based on the easier and harder halves of the twenty-test range 
yields a critical ratio of 13.46, and the difference in standard 
deviations for these two arrays of IQ’s has a critical ratio of 6.15. 

In order to investigate the possibility that the advantage 
of hard tests over easy ones in yielding higher mental ages was 
differential with respect to the mental maturity level of the 
individual child, a difference score was computed for each child 
by subtracting his IQ as determined from the ten easy tests from 
the comparable value based on the ten hard tests, and this differ- 
ence score was correlated with his IQ as determined by the ten 
tests recommended for his grade placement. This value was 
computed to be +.66. Although, out of the seventy-seven cases, 










only four failed to earn higher IQ’s on the harder tests (two had 
difference scores of zero and two had minus difference scores), this 
value indicates that the larger gains from the easier to the harder 
tests were associated with higher intelligence. The most impor- 
tant implication of this would appear to be that there is a limit 
beyond which increase in difficulty of a test does not result in a 
higher mental age. 

The investigation of the magnitude of the influence of test 
difficulty in the Kuhlmann-Anderson scale on the IQ for the 
present sample proceeded with the computation of the median 
mental age for the distribution of seventy-seven children on each 
test separately which all seventy-seven had been given. The 
reader will recall that tests 12 through 14 had not been given to 
the twelve-year-olds, and that tests 32 through 34 had not been 
given to the ten- and eleven-year-olds. Consequently, the 
median of the mental ages earned by all the children on each test 
was computed for only tests 15 through 31. These values are 
presented in Table IV. 


TABLE IV.—MEDIANS OF DISTRIBUTIONS OF MENTAL AGES (IN
MONTHS) EARNED BY SEVENTY-SEVEN SCHOOL CHILDREN
ON SUCCESSIVE KUHLMANN-ANDERSON TESTS

Test No.     15      16      17      18      19      20
Median      124.5   141.7   110.9   122.9   185.5   114.5

Test No.     21      22      23      24      25      26
Median      133.6   135.6   135.1   134.5   136.6   148.1

Test No.     27      28      29      30      31
Median      137.9   146.9   146.0   159.9   141.8


These same values are presented graphically in Figure 1, 
as also are the smoothed values resulting from averaging the first 
six, the middle five, and the last six original values, respectively. 
The relationship is estimated by a straight line drawn as nearly 
through these three smoothed values as was possible. 
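The smoothing procedure just described can be sketched as follows. Two cautions apply. First, the line in Figure 1 was drawn by eye; the least-squares fit used here is a substitute technique, not the author's. Second, the entry for test 19 is printed as 185.5, which lies above the 160-month top of the axis in Figure 1, so the sketch also tries 145.5, a purely hypothetical alternative reading. The entry for test 25 is taken as 136.6.

```python
# Median mental ages (months) by test number, from Table IV.
# Test 19 is kept as printed (185.5); test 25 is taken as 136.6.
MEDIANS = {
    15: 124.5, 16: 141.7, 17: 110.9, 18: 122.9, 19: 185.5, 20: 114.5,
    21: 133.6, 22: 135.6, 23: 135.1, 24: 134.5, 25: 136.6, 26: 148.1,
    27: 137.9, 28: 146.9, 29: 146.0, 30: 159.9, 31: 141.8,
}


def smoothed_points(medians):
    """Average the first six, middle five, and last six tests into three points."""
    groups = [range(15, 21), range(21, 26), range(26, 32)]
    return [
        (sum(group) / len(group), sum(medians[t] for t in group) / len(group))
        for group in groups
    ]


def least_squares_slope(points):
    """Slope of the ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in points)
    denominator = sum((x - mean_x) ** 2 for x, _ in points)
    return numerator / denominator


slope_as_printed = least_squares_slope(smoothed_points(MEDIANS))

# Hypothetical alternative reading of the test 19 entry (an assumption).
slope_alternative = least_squares_slope(smoothed_points({**MEDIANS, 19: 145.5}))

print(round(slope_as_printed, 2), round(slope_alternative, 2))
```

With the entry as printed the fitted slope is about 1.2 months of mental age per test; with the hypothetical reading it is about 1.8. Neither figure should be treated as a recovery of the author's hand-drawn line.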

The slope of this line of relationship appears to be such as to 
indicate an expected increase of approximately 1.8 to 1.9 months 
of mental age per test as one proceeds from easier to successively 
harder tests throughout the ranges of the Kuhlmann-Anderson 
scale concerned. If the present results may be regarded as 










fairly descriptive of what happens when this test is used in the 
school situation, certain probable effects may be noted. 

First, if a child, because of the operation of some circumstance 
unrelated to his mental development, has a school placement 
which is one or more grades distant from that indicated for 
his mental development, there will be a tendency for his Kuhl- 
mann-Anderson IQ to be in error. If his school placement is too 
low, he will be given a range of tests which are too easy for him; 
and his resulting mental age will tend to be too low. If his school 


Fig. 1.—A graph showing the increase in mental age associated with
successively more difficult Kuhlmann-Anderson tests. Each plotted point
represents the median of the distribution of mental ages earned by the
seventy-seven children on the particular test indicated. (Vertical axis:
mental age in months, 110 to 160; horizontal axis: test number, 15 to 31.)








placement is too high, the opposite effect will obtain. In either 
event there will be operative an artificial effect which will yield 
for him a mental age that will tend to confirm his present grade 
placement. Between the fourth- and the fifth-grade batteries 
of the Kuhlmann-Anderson scale there is a difference of four tests; 
i.e., the fourth-grade battery is comprised of tests 15 through 
24, and the fifth-grade battery includes tests 19 through 28. 
From the above results we should expect, on the average, a differ- 
ence of something over seven months of mental age when one 
battery is used instead of the other in particular cases. Between 
the fifth- and the sixth-grade batteries, the difference is one of 
three tests, and the effect would be expected to be proportionally 
less. 
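The arithmetic behind these expectations is simply the estimated slope multiplied by the number of tests by which two batteries are offset. The sketch below uses the range of 1.8 to 1.9 months per test given in the text; the function name is illustrative.

```python
# Estimated gain in mental age per successive test, in months (from the text).
SLOPE_LOW, SLOPE_HIGH = 1.8, 1.9


def expected_shift_months(tests_offset):
    """Range of expected mental-age difference for batteries offset by this many tests."""
    return tests_offset * SLOPE_LOW, tests_offset * SLOPE_HIGH


fourth_vs_fifth = expected_shift_months(4)  # batteries 15-24 and 19-28
fifth_vs_sixth = expected_shift_months(3)

print(fourth_vs_fifth, fifth_vs_sixth)
```

An offset of four tests gives between about 7.2 and 7.6 months, which is the "something over seven months" of the text; an offset of three tests gives between about 5.4 and 5.7 months.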







Secondly, in any group, such as a class, exhibiting a variation in 
mental ability, the spread of IQ’s would be curtailed in conse- 
quence of the operation of this factor. This would follow from 
the consideration that superior children with IQ’s appreciably
above 100 would find the tests relatively easy and would earn 
mental ages too low, thus displacing their IQ’s downward toward 
100; and children of inferior ability with IQ’s appreciably below 
100 would find the same tests relatively hard and would earn 
mental ages too high, thus displacing their IQ’s upward toward 
100. The net effect would be to decrease the apparent variability 
of the class. This analysis is consistent with Kuhlmann’s 
results which present smaller standard deviations within individual
grades for Kuhlmann-Anderson mental ages in comparison
with a number of other tests. 

It would appear probable that in the ordinary operation of 
a school testing program, no important misinterpretations would 
be made in the evaluation of the large majority of children as a 
result of the effects under discussion. In exceptional cases, how- 
ever, important errors in the evaluation of a child’s mental 
ability could result. In order to investigate further the influence 
of the difficulty of the particular Kuhlmann-Anderson battery 
administered on the magnitude of the measures of intelligence of 
exceptional children, the writer’s attention turned to a group of 
children where the exceptional child was the rule; i.e., to the
clinical population. 


CLINICAL DATA 


In handling the psychological examinations of children at the 
Bureau of Psychological Services, three intelligence tests have 
been included in the routine examination program at the ages 
with which the present study is concerned. These are the Stanford-Binet,
the Kuhlmann-Anderson, and the Arthur Performance
scales. For some time various clinicians had been reporting an 
impression that the Kuhlmann-Anderson scale tended to yield 
IQ’s somewhat below those yielded by the Stanford-Binet. In 
the light of the results discussed above it appeared possible that, if 
true, this phenomenon could be explained by the fact that the 
majority of the children seen for examination by the Bureau pres- 
ents an instance of academic difficulty which had resulted in a 







school placement below Stanford-Binet mental age expectancy. 
Since the operative clinical procedure in most cases dictated the 
administration of that Kuhlmann-Anderson battery recom- 
mended for the grade placement of the child, children falling into 
the category described above would be given Kuhlmann-Ander- 
son Tests too easy relative to their Stanford-Binet mental ages 
and might, therefore, be expected to achieve lower mental ages 
on the Kuhlmann-Anderson examination than they would on the 
Stanford-Binet. The development of this line of thought leads to 
the formation of a definite hypothesis, as follows: 


a) Children who had been given Kuhlmann-Anderson batteries 
below those indicated by their Stanford-Binet mental ages 
would have, on the average, lower Kuhlmann-Anderson than 
Stanford-Binet IQ’s. 

b) Children who had been given Kuhlmann-Anderson batteries 
above those indicated by their Stanford-Binet mental ages would 
receive, on the average, higher Kuhlmann-Anderson than Stan- 
ford-Binet IQ’s. 


From the files were selected all case reports of children who 
were nine through thirteen years of age, were in school, had been 
given the 1937 revision of the Stanford-Binet, had been given the 
fifth edition of the Kuhlmann-Anderson, and who had taken both 
of these examinations within the same month of chronological age. 
The inclusion of the nine- and the thirteen-year-olds was not 
originally planned, the intention being to restrict the present 
selection of cases to the same age range as had been studied in 
the school sample. The addition of one year on each side of this 
range became necessary to obtain a sufficient number of cases 
which could conform to the criteria of selection. The present 
procedure yielded one hundred sixteen cases. Because about 
one-half of these cases had been given the fifth edition of the 
Kuhlmann-Anderson before the 1942 norms were available, raw 
scores were recorded from the test booklets, and mental ages were 
reassigned on the basis of the 1942 norms. Other pertinent data 
relating to the Stanford-Binet and the Kuhlmann-Anderson scales 
were transcribed. Checking procedures and other precautions 
were followed to prevent clerical errors. 

These cases were separated into subgroups according to whether 










the Kuhlmann-Anderson battery administered was too high, 
correct, or too low relative to Stanford-Binet mental age, and the 
extent of the discrepancy was recorded. By using the Metropoli- 
tan Achievement Test age-grade equivalents table, the grade 
equivalent corresponding to each child’s Stanford-Binet mental 
age was determined. This value was compared for each child 
with the grade for which the Kuhlmann-Anderson battery admin- 
istered to him was recommended. By subtracting his mental 
age equivalent grade from the grade for which the Kuhlmann- 
Anderson battery administered to him was recommended, a 
score was obtained which indicated for each child in grade units 
the extent to which the Kuhlmann-Anderson battery given him 
was too difficult, correct, or too easy. Plus values indicated too 
difficult Kuhlmann-Anderson batteries, and minus values the 
reverse. The distribution of cases in terms of these values is 
given in Table V. 


TABLE V.—DEFINITION OF THE SIX CATEGORIES INTO WHICH THE
CLINICAL SUBJECTS WERE DISTRIBUTED

Category   Description of Group                                     N

  +2       K-A Battery Two Grades Above That Indicated by
           S-B Mental Age                                          11
  +1       K-A Battery One Grade Above That Indicated by
           S-B Mental Age                                          19
   0       K-A Battery Correct Relative to S-B Mental Age          35
  −1       K-A Battery One Grade Below That Indicated by
           S-B Mental Age                                          23
  −2       K-A Battery Two Grades Below That Indicated by
           S-B Mental Age                                          17
  −3.6     K-A Battery Three or More Grades Below That
           Indicated by S-B MA (Av. = 3.6)                         11
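The classification into these categories may be sketched as below. The grade equivalents would come from the Metropolitan Achievement Test table, which is not reproduced here, so the inputs shown are hypothetical whole grades; the function names and labels are likewise illustrative.

```python
def battery_offset(battery_grade, sb_grade_equivalent):
    """Grade of the K-A battery given minus the grade implied by S-B mental age."""
    return battery_grade - sb_grade_equivalent


def category(offset):
    """Label an offset, pooling three or more grades below into one group."""
    if offset <= -3:
        return "-3 or more"
    if offset == 0:
        return "0"
    return f"{offset:+d}"


# Case counts per category as given in Table V.
TABLE_V_COUNTS = {"+2": 11, "+1": 19, "0": 35, "-1": 23, "-2": 17, "-3 or more": 11}
total_cases = sum(TABLE_V_COUNTS.values())

# Hypothetical child: sixth-grade battery, S-B mental age equivalent to grade 4.
print(category(battery_offset(6, 4)), total_cases)
```

The tabled counts sum to the one hundred sixteen cases selected from the files.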


The examination of the hypothesis proceeded by computing the 
mean Stanford-Binet IQ and the mean Kuhlmann-Anderson IQ 
for each category of cases defined in Table V. A direct compari- 
son of these values within a given category, however, would be 
inadequate for the purpose of evaluating the hypothesis for two 
reasons. First, it is possible that there exists a difference 










between Stanford-Binet and Kuhlmann-Anderson IQ’s such that 
one scale would yield consistently lower values than the other as 
a result of differences in the average intelligence of the respective 
standardization samples. In such a case the operation of this 
factor could produce differences in the mean IQ’s for the two 
scales within a given category which would be unrelated to the 
difficulty level of the Kuhlmann-Anderson batteries administered. 

The second objection to a direct comparison of the mean Kuhl- 
mann-Anderson IQ with the mean Stanford-Binet IQ for a given 
category depends upon the fact that the method of classifying the 
clinical cases into the six categories presented in Table V incor- 
porated a selective factor of such a nature that the average level 
of intelligence of the subjects in each category increased markedly 
from the ‘plus’ categories to the ‘minus’ ones. Since the two 
scales are not perfectly correlated, the expected Kuhlmann- 
Anderson mean IQ for a given category would not be the same as 
the Stanford-Binet mean but would be a value predicted from the 
Stanford-Binet mean on the basis of the regression of Kuhlmann- 
Anderson IQ’s on Stanford-Binet IQ’s.* 

Therefore, the indicated procedure would appear to be to 
predict the mean Kuhlmann-Anderson IQ for each category from 
the corresponding Stanford-Binet value, and compare this 
predicted value with the obtained Kuhlmann-Anderson mean IQ. 
For such predictions, however, the correlation coefficient describ- 
ing the relation between the two sets of IQ’s and the means and 
standard deviations of both variables are necessary. Unfor- 
tunately, the anomalous characteristic of the Kuhlmann-Ander- 
son scale which is being studied would be expected to influence 
this correlation coefficient and the standard deviation of the 
Kuhlmann-Anderson IQ’s. 

* The original treatment of these data did not take into account this con-
sideration of statistical regression. The writer is indebted to Professor
Quinn McNemar of Stanford University for calling this matter to his
attention.

Referring again to Table V, it may be noted that for the ‘zero’
category, the rationale of this study indicates that the Kuhlmann-
Anderson battery administered was of the correct difficulty.
For these thirty-five cases, therefore, the defect in the Kuhlmann-
Anderson scale under investigation should have a negligible
influence on the obtained IQ’s. Using this group as a base for
computing the regression constants, in spite of the small number
of cases, appeared to provide the best procedure for evaluating
the hypothesis which could be followed under the restrictions of
these data. Accordingly, for the thirty-five cases of the ‘zero’
category, the correlation between the Stanford-Binet and Kuhl-
mann-Anderson IQ’s and the means and standard deviations
for both arrays were computed. These values are presented in
Table VI.


TABLE VI. STATISTICAL RESULTS COMPARING STANFORD-BINET
AND KUHLMANN-ANDERSON FOR THE ‘ZERO’ CATEGORY OF
CLINICAL SUBJECTS


Identity of IQ               M         σ         r

Stanford-Binet.........     89.74     14.17     +.84
Kuhlmann-Anderson......     84.09     11.97


Using the values in Table VI and the mean Stanford-Binet IQ 
for a given category, a predicted Kuhlmann-Anderson mean IQ
was obtained for each category by substituting in the ordinary 
regression equation. This value was subtracted in each category 
from the obtained mean Kuhlmann-Anderson IQ, and the result- 
ing differences were taken as a basis for evaluating the hypothe- 
sis. Table VII presents the results of this procedure. 
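
The substitution in the ordinary regression equation may be sketched as follows. This is a minimal illustration rather than the writer's own computation: the function name is hypothetical, and the correlation is taken as .84, the value which reproduces the predicted means of Table VII to within rounding.

```python
# Regression constants for the 'zero' category (Table VI).
SB_MEAN, SB_SD = 89.74, 14.17   # Stanford-Binet mean and standard deviation
KA_MEAN, KA_SD = 84.09, 11.97   # Kuhlmann-Anderson mean and standard deviation
R = 0.84                        # correlation between the two IQ's (assumed reading)


def predict_ka_mean(sb_mean_iq):
    """Predict a mean K-A IQ from a mean S-B IQ (regression of K-A on S-B)."""
    slope = R * KA_SD / SB_SD
    return KA_MEAN + slope * (sb_mean_iq - SB_MEAN)


# (category, mean S-B IQ, obtained mean K-A IQ), as given in Table VII.
rows = [
    ("+2", 75.73, 82.45),
    ("+1", 78.42, 78.11),
    ("0", 89.74, 84.09),
    ("-1", 97.96, 85.17),
    ("-2", 113.41, 96.06),
    ("-3.6", 133.09, 106.55),
]

for category, sb_mean, obtained in rows:
    predicted = predict_ka_mean(sb_mean)
    # Obtained minus predicted: positive when the K-A IQ runs above expectation.
    print(f"{category:>5}  predicted {predicted:7.2f}  difference {obtained - predicted:+6.2f}")
```

Because the correlation is rounded to two places, the printed values may differ from the tabled ones by a unit in the second decimal.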


TABLE VII. STATISTICAL COMPARISONS OF THE MEAN
STANFORD-BINET AND KUHLMANN-ANDERSON IQ’s FOR
EACH CATEGORY OF CASES DEFINED IN TABLE V


                    Mean      Predicted    Obtained    Obtained Mean K-A IQ
                    S-B       Mean         Mean        Minus Predicted
Category     N      IQ        K-A IQ       K-A IQ      Mean K-A IQ

  +2         11     75.73      74.15        82.45         +8.30
  +1         19     78.42      76.06        78.11         +2.05
   0         35     89.74      84.09        84.09          0.00
  -1         23     97.96      89.92        85.17         -4.75
  -2         17    113.41     100.88        96.06         -4.82
  -3.6       11    133.09     114.85       106.55         -8.30







These results are consistent with those to be expected from the 
hypothesis. It will be recalled that the ‘plus’ categories con-
tained subjects for whom the Kuhlmann-Anderson batteries
administered had been of greater difficulty than was indicated
by their Stanford-Binet mental ages and that the hypothesis
required that their Kuhlmann-Anderson IQ’s be above those to 
be expected from their Stanford-Binet IQ’s. With respect to 
the ‘minus’ categories exactly the reverse situation obtained. 
The organization of Table VII is such that positive deviations in 
the last column indicate higher obtained mean Kuhlmann- 
Anderson IQ’s than would be expected from the mean Stanford- 
Binet IQ’s and negative deviations are associated with lower 
obtained mean Kuhlmann-Anderson IQ’s than were predicted 
from the mean Stanford-Binet values. Inspection of Table VII 
will indicate that all five observations which are significant in 
the evaluation of the hypothesis conform to the requirements of 
the hypothesis. 

Because of the small number of cases upon which the regression 
constants were computed and the small numbers of cases which 
are the basis for each of the five observations, the results of Table 
VII should be regarded only as a rough indication of the probable 
magnitude of the effect on the IQ which results from the adminis- 
tration of a Kuhlmann-Anderson battery of improper difficulty. 
The irregularity of the relationship between the amount of dis- 
placement in difficulty of the Kuhlmann-Anderson battery 
administered relative to the one indicated by Stanford-Binet 
mental age and the discrepancy between resulting Stanford-Binet 
and Kuhlmann-Anderson IQ’s is probably due to the relative 
unreliability of each separate observation. 

In general, however, these clinical data definitely exhibit the 
effects of the same characteristic of the Kuhlmann-Anderson 
scale that was isolated in the analysis of the school data reported 
above, viz., that Kuhlmann-Anderson Tests which are of too 
great difficulty yield mental ages and IQ’s which are too high 
and Kuhlmann-Anderson Tests which are too easy yield cor- 
responding values which are too low. 

The reader will recall that in the earlier pages of this report 
Anderson was quoted to the effect that to insure the greatest 
accuracy children receiving two or more zero scores should be
given the tests necessary to be rated on the next lower grade
battery, and that those receiving two or more maximum scores 
should be given the supplementary tests necessary to be rated on 
the next higher grade battery. With this in mind the writer 
analyzed the present clinical data for the frequency of zero and 
maximum scores to determine if these criteria would select those 
cases for whom the greatest discrepancy between Kuhlmann- 
Anderson and Stanford-Binet IQ’s had been found. The dis- 
tribution of individuals earning zero and maximum scores in the 
various combinations of these which occurred for each category 
of grade difference between the Kuhlmann-Anderson battery 
administered and the one indicated by Stanford-Binet mental 
age, as well as the total percentages of zero, maximum, and both 
zero and maximum scores found for the individuals in each 
category, are presented in Table VIII. 

TABLE VIII. DISTRIBUTIONS OF ZERO AND MAXIMUM KUHLMANN-
ANDERSON TEST SCORES SEPARATELY FOR EACH
CATEGORY OF CLINICAL CASES

[Most cell entries of this table are illegible in the source; the entries that survive are given after their row labels, without column alignment.]

Category: -3.6, -2, -1, 0, +1, +2

4 Max. Scores; No Zero
3 Max. Scores; No Zero
2 Max. Scores; No Zero          1
1 Max. Score; No Zero           1  3
4 Max. Scores; 1 Zero           1
2 Max. Scores; 1 Zero
1 Max. Score; 1 Zero            2  2
1 Max. Score; 4 Zero
No Max. Score; 1 Zero           2
No Max. Score; 2 Zero           1
No Max. Score; 3 Zero
No Max. Score; 4 Zero
No Max. Score; No Zero
Per cent Max. Scores
Per cent Zero Scores            0.
Per cent Max. and Zero Scores   2


Examination of Table VIII indicates that only a very general
and diffuse relationship exists between the degree of displacement
of the Kuhlmann-Anderson battery administered relative to that 
indicated by Stanford-Binet mental age and the occurrence of 
zero and maximum scores. Stating it differently, the absence 
of zero scores does not appear to guarantee that the difficulty of 
the battery is not too high; and the absence of maximum scores 
does not necessarily indicate that the difficulty of the battery is 
not too low. 

Several results should be specifically noted in Table VIII. 
The criterion of ‘two or more zero’ or ‘two or more maximum’ 
scores applied to these data will select unambiguously only seven 
of the eighty-one cases for whom the difficulty level of the Kuhl- 
mann-Anderson scale used was inappropriate, and only one of 
these seven represents a displacement of the proper battery by 
more than one grade category. Six other cases exhibited ‘two or 
more maximum’ or ‘two or more zero’ scores, and these six are 
included in the ‘zero’ category for which the rationale of the 
present study indicates the battery administered was of the 
correct difficulty. Eight cases out of the one hundred sixteen 
exhibited some combination of both zero and maximum scores.
The majority of cases in every category exhibited neither zero
nor maximum scores.


INTERPRETATION OF RESULTS 


An explanation for the vagueness of the relationship between 
zero scores and relatively high difficulty of tests administered, 
on the one hand, and between maximum scores and relatively 
low difficulty of tests administered, on the other, may be found 
in an opinion expressed by Kuhlmann which is quoted earlier in 
this paper. The statement indicated that the degree of difficulty 
which a test presents to a child will determine the extent of his 
effort in working on it. According to this interpretation, maxi- 
mum scores would tend not to occur when the child works on a 
test too easy for him because he is not sufficiently challenged by 
the relatively simple task. Conversely, when the test is too 
difficult, there will be a tendency for highly motivated effort to 
result in an avoidance of zero scores. 

This same motivational explanation may be applied generally
to the principal result isolated from the present procedures; viz.,
that a Kuhlmann-Anderson battery which is too easy for the child 
yields mental ages which are too low, and a battery which is too 
difficult for the child yields mental ages which are too high. 
According to the theoretical position under consideration, this 
effect would obtain because in both cases mental age norms would 
be assigned which were determined from the performances of a 
standardization group for which the battery concerned had been 
of the correct difficulty, eliciting, on the average, an intermediate 
degree of effort. 

This motivational explanation, however, does not recommend 
itself on the basis of a critical evaluation. It is too pat, and it 
presents a one-sidedness of viewpoint to the exclusion of equally 
possible, oppositely acting, motivational factors. One can
imagine, with different results to explain, an interpretation to the
effect that on easy tests the child, encouraged by success, is
spurred on to greater efforts; and on difficult tests the child, 
discouraged by failure, gives up. As a matter of fact, such 
phenomena are commonly observed by the practicing clinical 
psychologist. The present interpretation rejects as primary the 
motivational explanation presented above both in connection 
with the results of this study and in connection with certain of 
the findings reported by the authors of the scale in their examina- 
tions of Kuhlmann-Anderson Tests data. It is preferable to 
explain these results in terms of a fallacy in the standardization 
procedures used in the development of this scale. 

Credit for noticing this fallacy is due unreservedly to Professor 
Carl R. Brown of the University of Michigan, who was kind 
enough to turn his attention from his own research to a con-
sideration of the results presented in this paper.

The essential nature of the standardization process followed 
is indicated in the Manual for the Kuhlmann-Anderson Tests by 
the statement that, “The average age was determined at which
one trial and no more, two trials and no more, etc., were passed.” 
A mental age norm appears to have been assigned, therefore, to 
each possible score above zero on each test on the basis of the 
average age of the children who make that particular score. 
This is a reversal of the usual procedure in assigning mental age 
norms in which the average performance of a group of children
of the same chronological age is determined, and that chrono-
logical age is assigned as the mental age norm corresponding to
their average test performance. These two procedures represent 
the assigning of mental age norms in accordance with different 
regression lines in the bivariate distribution of test score and 
chronological age. The former (the procedure followed in the 
standardization of the Kuhlmann-Anderson Tests) makes use 
of the regression line of chronological age on test score, and the 
latter employs the regression line of test score on chronological 
age. 
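
The contrast between the two procedures may be sketched on a small set of records. The data and the function names below are hypothetical and serve only to show which averages are taken in each case; they are not drawn from the Manual or from the present study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (chronological age in years, raw score) pairs; not study data.
records = [(8, 3), (8, 4), (9, 4), (9, 5), (10, 5), (10, 6)]


def mean_age_per_score(pairs):
    """Average age of the children making each score.

    This corresponds to the regression of chronological age on test score,
    the basis the Manual's statement appears to describe.
    """
    ages_by_score = defaultdict(list)
    for age, score in pairs:
        ages_by_score[score].append(age)
    return {score: mean(ages) for score, ages in sorted(ages_by_score.items())}


def mean_score_per_age(pairs):
    """Average score of the children of each chronological age.

    This corresponds to the regression of test score on chronological age,
    the usual basis for mental age norms.
    """
    scores_by_age = defaultdict(list)
    for age, score in pairs:
        scores_by_age[age].append(score)
    return {age: mean(scores) for age, scores in sorted(scores_by_age.items())}


print("mean age for each score:", mean_age_per_score(records))
print("mean score for each age:", mean_score_per_age(records))
```

Under the first function a score is looked up and an age is read off directly; under the second, the age whose average score equals the obtained score must be found. The two readings agree only where the two regression lines intersect.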

Thurstone⁷ in contrasting these two procedures some years ago
indicated that the mental age norms resulting from them would
be different and concluded, therefore, that the concept of mental
age is ambiguous. Godfrey H. Thomson⁶, in replying to Thur-
stone’s article, carried his analysis of the problem much further.
He concluded that in the ideal situation in which the standardi- 
zation group included an adequate sampling not only of all 
individuals of various chronological ages but also of all individuals 
who would make all possible test performances the difference in 
the two procedures for assigning mental age norms would sub- 
stantially disappear. This conclusion was based upon a more 
primary result of his analysis which indicated that under these 
ideal conditions test score and chronological age would yield a 
peculiar type of correlation surface for which the two regression 
lines would be practically coincident although the correlation 
would not be unity. He points out, however, that if the selec- 
tion of the standardization group is confined to a particular age 
range, the regression of chronological age on test score will 
become an artifact determined by the arbitrary restriction of the 
age range, but that the regression of test score on chronological 
age will be unaffected by the artificial nature of the distribution 
of chronological age. This distinction between the two regres- 
sion lines is the pertinent one for the present discussion. The 
reader is referred to Thomson’s paper for an entirely convincing 
demonstration of it, should this be necessary. 

It appears quite definite that in practice any sampling of chil- 
dren for the purpose of standardizing an age scale of intelligence 
will constitute an arbitrary, artificial sampling of chronological 
age to some degree, and, accordingly, that the resulting regression
of chronological age on test score will to some extent be arti-
factual. That the sampling of chronological age was extremely
arbitrary in the case of the latest standardization of the several 
Kuhlmann-Anderson Tests is suggested by the following state- 
ment from the Kuhlmann-Anderson Manual: 

“In determining the original norms, every test was given to a 
very wide range of school grades and ages. Since then it was 
learned that children adjust their efforts to the difficulty of the 
task, resulting in many of the older children’s passing no more 
trials on a test than the younger for whom the tests were more 
difficult. Thus the average age of children passing most or all 
trials in a test was raised above what it should have been.” 

Since this statement occurs in the discussion of the superiority 
of the more recent norms, it would appear that these norms for a 
particular test were based on a more restricted sampling of 
chronological ages than were the original ones, thereby emphasiz- 
ing their artifactual character. Moreover, the last sentence in 
the quotation illustrates nicely the artificial nature of mental age 
norms based on the regression of chronological age on test score. 
When older children were included the mental age norm for higher 
test scores was raised, and when these were excluded the norm was 
lowered. The average age of children making a given test per- 
formance depends importantly on the range and distribution of 
the chronological ages arbitrarily included in the sample. It is 
superfluous to attempt to explain this on a motivational basis. 
Rather, the explanation will become obvious if the reader will 
experiment by arbitrarily cutting off age sections of a scatter 
diagram of test score and chronological age and observing the 
effect on the regression of chronological age on test score for the 
remaining part of the correlation diagram. It should also be 
noticed that such an arbitrary treatment of the sampling of 
chronological ages does not affect the other regression line. 
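
The experiment suggested above may also be carried out numerically. The following is a sketch under stated assumptions: the scores are simulated (not taken from any Kuhlmann-Anderson data), the relation of score to age is made linear with a correlation near .80, and the age limits of 8.5 and 10.5 years are arbitrary.

```python
import random


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, seed=0):
    """Simulated ages (mean 9.5, SD 1.5) and scores correlating about .80 with age."""
    rng = random.Random(seed)
    ages = [rng.gauss(9.5, 1.5) for _ in range(n)]
    scores = [5.0 + (age - 9.5) + rng.gauss(0.0, 1.125) for age in ages]
    return ages, scores


def restrict(ages, scores, low, high):
    """Keep only the cases whose chronological age lies between low and high."""
    kept = [(a, s) for a, s in zip(ages, scores) if low <= a <= high]
    return [a for a, _ in kept], [s for _, s in kept]


ages, scores = simulate()
cut_ages, cut_scores = restrict(ages, scores, 8.5, 10.5)

# Regression of test score on chronological age: little changed by the cut.
print("score on age:", slope(ages, scores), slope(cut_ages, cut_scores))
# Regression of chronological age on test score: flattened markedly by the cut.
print("age on score:", slope(scores, ages), slope(cut_scores, cut_ages))
```

In such a run the first pair of slopes should agree closely, while the second slope of the second pair should fall well below the first, which is the artifact under discussion.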

It is the purpose of the present discussion to demonstrate that 
the selection of the regression of chronological age on test score 
as the basis for the mental age norms assigned to the Kuhlmann- 
Anderson Tests has produced inconsistencies sufficient to explain 
the principal results of the present study of this scale. Figure 2 
represents a purely hypothetical instance of the standardiza-
tion of a single Kuhlmann-Anderson Test. In order to simplify the
exposition, let us consider a normal correlation surface represent-
ing the relationship between score on the test and chronological
age. This condition of normality of distribution for each of these
variables, although not true in the actual situation, does not affect
the fundamental consideration being illustrated. This follows
from the fact that when the sampling of chronological ages
involves a restricted range the two regression lines will tend to be
in the same relationship with respect to each other as would
be the case if the two variables were normally distributed. How-
ever, the inclusion of this condition very probably distorts the
magnitude of the effect under discussion. The ellipse is one of
the contour lines of the correlation surface, and is drawn to
represent approximately a degree of relationship described by a
correlation coefficient of +.80.


Fig. 2. Graphical representation of an hypothetical instance of the standardiza-
tion of a single Kuhlmann-Anderson test. (Vertical axis: raw score, with the
corresponding K-A MA norms in the left-hand column; horizontal axis:
chronological age.)







Examination of Figure 2 will reveal that the mental age norms 
found in the extreme left-hand column were assigned on the basis 
of the regression line ab which represents the regression of 
chronological age on test score. This completes the analogy 
with the standardization procedures applied to the Kuhlmann- 
Anderson Tests inasmuch as this line is drawn through the means 
of the age distributions corresponding to various test scores. 
We have discussed at length the artifactual character of this 
regression line, but now, having permitted an arbitrary deter- 
mination of the distribution of chronological ages to yield it, we 
may proceed to an examination of the inconsistencies which result. 

Universal clinical practice, in connection with the use of age 
scales or any measurements of factors the development of which
is associated with chronological age, evaluates a given perform-
ance in terms of the average performance found for a given
chronological age. The average child of any particular chrono- 
logical age is defined as that child whose performance is equal to 
the average performance of children of the same chronological 
age. Such average performances are indicated in Figure 2 by the 
line cd, inasmuch as this line is drawn through the means of the 
distributions of test score corresponding to different chronological 
ages. This is the regression line of test score on chronological 
age. 

Let us examine some of the implications of this arrangement 
in terms of the findings of the present study. These results 
indicated that the administration of Kuhlmann-Anderson Tests 
which were of too great difficulty for the child’s mental develop- 
ment resulted in mental ages which were too high, and, con- 
versely, that such tests which were of too little difficulty yielded 
mental ages which were too low. Only when the difficulty of the 
tests was exactly correct were the mental ages correct. Referring 
to Figure 2, we find for the one point in the age scale at which 
this hypothetical test exactly fits (nine years and six months), 
that children of nine and one-half years of age, on the average, 
earn a raw score of five points and a mental age of nine and one- 
half years. If, however, this test is made too difficult by being 
administered to a group of eight-year-old children, they will earn 
an average score of three points and an average mental age of 
eight years and seven months. Moreover, if the test is made too
easy by being given to a group of eleven-year-olds, they, on the
average, will receive a raw score of seven points and a mental age 
of ten years and five months. 

Illustrating the same point somewhat differently, seven-year- 
old children, on the average, would achieve an IQ rating on this 
test of 113; and these same children when they reach the age of 
twelve years would receive an average IQ of 92. It also follows 
that any child achieving a raw score below 5 on this test will be 
assigned a mental age which is too high, and any child who earns 
a raw score above 5 will receive a mental age rating which is 
too low. This relates directly to the finding in the analysis 
of the school data that the same children at the same time 
received successively higher mental age ratings on successively 
more difficult Kuhlmann-Anderson Tests. On the easier tests 
their performances tended to be above the score point correspond- 
ing to that point on the age scale at which each test was exactly 
of the correct difficulty, and on the harder tests their performances 
tended to be below such points. It should be understood that 
the foregoing illustration with its numerical examples should not 
be interpreted as indicative of the actual magnitude of the effect 
being illustrated. Its purpose is merely to provide a general 
understanding of the basic processes which are operating to 
yield the empirical results of this study. 
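
The arithmetic of this hypothetical illustration may be approximated as follows. The sketch rests on an assumption not stated in the text: that for a normal correlation surface, passing from age to average score along line cd and from that score back to average age along line ab reduces the deviation from the point of exact fit by the square of the correlation. The function names are hypothetical, and the results agree only approximately with values read from Figure 2.

```python
FIT_AGE = 9.5   # age in years at which the hypothetical test exactly fits
R = 0.80        # correlation represented by the ellipse in Figure 2


def expected_mental_age(chron_age):
    """Average mental age earned at a given chronological age (norms from line ab).

    Assumption: the deviation of chronological age from FIT_AGE is shrunk
    by R squared when both regression lines are applied in succession.
    """
    return FIT_AGE + R ** 2 * (chron_age - FIT_AGE)


def expected_iq(chron_age):
    """Average IQ (100 x MA / CA) on this single hypothetical test."""
    return 100.0 * expected_mental_age(chron_age) / chron_age


for age in (7, 8, 9.5, 11, 12):
    print(f"CA {age:>4}: MA {expected_mental_age(age):5.2f}, IQ {expected_iq(age):6.1f}")
```

The average IQ falls steadily as the same test is given to older children, although at every age the children concerned are, by definition, average for their age.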

In any bivariate distribution of intelligence test score and 
chronological age for which the latter variable is arbitrarily 
restricted, there will be two ‘lines of regression.’ The regression 
of test score on chronological age will be unaffected by the 
artificial nature of the distribution of chronological age. The 
regression of chronological age on test score, however, will be 
artifactual as a result of the arbitrary characteristic of the 
distribution of chronological age. The mental age norms assigned 
to the various tests of the Kuhlmann-Anderson scale were based 
on the regression of chronological age on test score. Against
the particular organization of the Kuhlmann-Anderson scale, the 
effects of assigning mental age norms on this basis are such that 
any child will tend to earn successively higher mental ages on 
successively more difficult tests; and when the test battery as a 
whole is too difficult or too easy, the median mental age earned by 
the child will be respectively too high or too low. These are
precisely the characteristics of the Kuhlmann-Anderson Tests
which emerged from the analysis of the data of the present 
investigation. Since these inconsistencies would not have 
existed had the mental ages been assigned according to the regres- 
sion of test score on chronological age, it would appear that the 
fallacy in the standardization of this scale results from the use of 
the wrong regression line. 

In many respects other than the one with which this report 
is concerned, these tests appear to be excellent. It is to be hoped 
that means will be found to restandardize them upon a more 
adequate basis. 


SUMMARY 


The measurements yielded by the Kuhlmann-Anderson Tests 
were investigated in a group of seventy-seven school children and 
with results from one hundred sixteen clinical subjects. Analysis 
of these data indicated that such measurements were influenced 
by the difficulty of the particular tests selected for administra- 
tion. Within a given group of subjects, increased difficulty of 
tests administered was associated with increased MA and IQ 
values. The most satisfactory explanation was found to be 
related to a fallacy in standardization procedures leading to the 
use of the wrong regression line in the bivariate distribution of 
test score and chronological age as a basis for the assigning of 
mental age norms. 


BIBLIOGRAPHY 


1) Anderson, Rose G. “The Fifth Revision of the Kuhlmann-
Anderson Tests.” J. appl. Psychol., 1940, 24, 198-206.

2) Guilford, J. P. Fundamental Statistics in Psychology and
Education. New York: McGraw-Hill, 1942.

3) Kuhlmann, F. “A Median Mental Age Method of Weighting
and Scaling Mental Tests.” J. appl. Psychol., 1927, 11, 181-198.

4) Kuhlmann, F. “The Kuhlmann-Anderson Intelligence Tests
Compared With Seven Others.” J. appl. Psychol., 1928, 12,
545-594.

5) Kuhlmann, F. “Effect of Degree of Difficulty on Operation
of Intelligence Tests.” J. juv. Res., 1930, 14, 8-21.

6) Thomson, Godfrey H. “The Mental Age Concept and the
Standardization of Group Tests.” Psychol. Rev., 1928, 35,
398-413.

7) Thurstone, L. L. “The Mental Age Concept.” Psychol.
Rev., 1926, 33, 268-278.

8) Kuhlmann-Anderson Intelligence Tests: Instruction Manual.
(Fifth Edition). Minneapolis: Educational Test Bureau, 1942.





ADJUSTMENT OF ADOLESCENT DAUGHTERS OF 
EMPLOYED WOMEN TO FAMILY LIFE¹


MARY ESSIG 
North Kansas City (Missouri) High School 


and 


D. H. MORGAN 
Colorado A. and M. College 


Today economic changes have altered ways of living in many 
homes. Some homemakers have had to take over the job of 
bread-winners, others have felt it their patriotic duty to help in 
the period of manpower shortage, and still others have been 
unable to withstand the lure of high wages for satisfying the 
material wants of themselves or their family. Those mothers 
who for one reason or another have engaged in gainful employ- 
ment have consequently had less time for homemaking. 

It is natural for a child who is unhappy in his family relation- 
ships to seek satisfaction away from home. If anti-social activi- 
ties prevail in the community, he is more susceptible to the 
attractions of delinquency than is another child who finds satisfac-
tion in his home. As Keliher (* p. 336) has pointed out, one very
important cause among all possible causes of maladjustment is the
fact that the mother is employed, but the surprising thing is that
worse signs of maladjustment are not found in many children.

Because of the widespread opinion that delinquent tendencies 
in children result when their mothers are employed outside the 
household and because maladjustment to the home and family 
is one of the first steps toward juvenile delinquency, a study to 
determine whether adolescent daughters of employed mothers 
are more poorly adjusted to family life than are those whose 
mothers do not work seems particularly pertinent at this time. 
The problem has been divided into the following questions: 

1) How well adjusted to family life are the adolescent daughters
of employed women?
2) How well adjusted to family life are the adolescent daughters
of women who do not work outside the home?
3) How do the above groups compare in this respect?


¹ Presented as Master’s thesis, Summer Session, 1945, Colorado A. and M.
College and rewritten by special permission of the Dean of the Graduate
School. The writers are indebted to Dr. Maude Williamson, Colorado
A. and M. College, for her critical comments and suggestions.


THE PRESENT INVESTIGATION 


For determining the effect of the employment of women upon 
the adjustment of their daughters to family life, two data-gather- 
ing devices were administered to five hundred girls enrolled in 
ninth- and tenth-grade home economics classes—an adjustment 
scale and a short questionnaire. 

Data-gathering devices.—An adjustment scale by Leland Stott 
entitled ‘Home Life,’ a revision of his ‘Family Life’ Scale,’ was 
selected as being the most valid for use in this study. Con- 
cerning this scale, the author stated in a letter: 

“The Home Life scale is the same as the one used in the study
to which you referred, except that it is somewhat more refined. 
It includes only those items that were found to be most closely 
related to the attitude measured. The reliability as well as the 
validity of the scale was improved. With both high school and 
college students the reliability coefficients based on the present 
eighty items were about .94.” 

The scale contained eighty questions to be answered by encir- 
cling one of three responses; ‘F,’ ‘O,’ ‘R,’ which were defined as 
follows: 


F means—frequently, usually, most of the time, nearly 
always, etc. 

O means—occasionally, once in a while, sometimes, etc: 

R means—rarely, very seldom, almost never. 


The short questionnaire previously mentioned which accom- 
panied the scale was designed to secure information about the 
community, the home of the girl, and the work of the mother. 


ADMINISTRATION OF DATA-GATHERING DEVICES 


In order to obtain a sample from various types of communities, 
ten home economics teachers in Missouri were asked to coöperate
in securing the information. Those asked to assist were either 
teachers known by one of the writers or recommended by the 
State supervisor of home economics as a teacher who had the 
confidence and good will of the girls in her community. 





Adjustment of Daughters of Employed Women 221 


Because it was believed that the answers of the girls would be 
more valid if the names did not appear on either the scale or the 
questionnaire, the scales and the questionnaires were numbered 
from 1 to 500 inclusively. The teachers who administered these 
forms gave each girl a scale with the same number as that on her 
questionnaire. 

The questionnaire and scale from each girl were clipped together 
and sorted according to those whose mothers worked full time for 
pay and those whose mothers did not work outside the home. 
One hundred and sixty-three records were discarded because the 
mothers worked intermittently, the girls came from broken homes, 
or the records were incomplete. Since there were one hundred 
fifty-one complete records for the girls whose mothers worked, a 
like number by communities from the girls whose mothers did not 
work were scored as they came without selection. 


DESCRIPTION OF SAMPLES 


The girls included in this study came from eleven communities 
in Missouri, an equal number of each group, experimental and 
control, from each community. According to the classification 
of the communities in the 1940 census, three-fourths of the girls 
were from urban communities. Of these three-fourths, more than 
one-third came from North Kansas City which is a part of Greater 
Kansas City. One-fourth of the total number of girls came from 
small communities classified as rural. 


TABLE 1. AGES OF GIRLS IN STUDY CLASSIFIED ACCORDING TO
EXPERIMENTAL GROUP AND CONTROL GROUP


                   Experimental Group     Control Group
                   N = 151                N = 151
Ages of Girls      Per Cent               Per Cent

     18                 0.7                   0.7
     17                 2.6                   4.0
     16                14.6                   8.6
     15                32.4                  33.1
     14                43.7                  43.7
     13                 6.0                   9.9
















The girls in this sample ranged in age from thirteen to eighteen 
years (Table 1). The ages of the two groups were fairly com- 
parable with the majority of both groups either fourteen or fifteen 
years old. 

Almost three-fourths of the employed mothers in this sample 
had worked two or more years (Table 2), and more than one- 
fourth had been employed for five or more years. 


TABLE 2. NUMBER OF YEARS MOTHERS OF GIRLS IN
EXPERIMENTAL GROUP HAD BEEN EMPLOYED

Number of Years        Per Cent of Mothers (N = 151)

6 or more                        19.2
5                                 8.0
4                                10.0
3                                15.8
2                                19.2
Less than 2                      27.8


Almost eighty-five per cent (Table 3) of the employed mothers 
of the girls in this study worked during the day. While fifteen per 
cent did some night work, only six per cent did night work 
exclusively. 


TABLE 3. TIME MOTHERS OF GIRLS IN EXPERIMENTAL GROUP
WERE EMPLOYED

Time                   Per Cent of Mothers (N = 151)

Day                              84.7
Night                             6.0
Both                              9.3


Approximately ninety-seven per cent (Table 4) of the mothers 
who did not work were usually at home when their daughters 
arrived. Two per cent of the mothers who did not work were 
rarely at home, while almost one-half of the mothers who worked 
were rarely at home when their daughters arrived. 


DIFFERENCES IN ADJUSTMENT OF THE TWO GROUPS 


The eleven school communities varied in the mean or average 
score on the daily adjustment scale for each group. In each case, 
however, the control group scored higher than did the experi- 
mental group (Table 5). The difference of the means between the 







TABLE 4. NUMBER OF MOTHERS OF SAMPLE AT HOME WHEN
DAUGHTERS ARRIVED FROM SCHOOL, CLASSIFIED ACCORDING
TO EXPERIMENTAL GROUP AND CONTROL GROUP

Mother at Home When    Experimental Group    Control Group
Girl Arrives                N = 151             N = 151
                           Per Cent            Per Cent

Frequently                   36.4                96.7
Occasionally                 16.6                 1.3
Rarely                       47.0                 2.0











experimental and the control groups ranged from 21 points in the 
case of West Plains to only seven points for North Kansas City. 
The median was higher in each of the schools for the control group. 


TABLE 5. MEAN, MEDIAN, AND MODIFIED RANGE OF SCORES OF
GIRLS, EXPERIMENTAL GROUP AND CONTROL GROUP,
ACCORDING TO COMMUNITIES

                               Mean                     Median            Modified Range¹
Community           No.   Exper.  Control  Diff    Exper.  Control  Diff    Exper.  Control

North Kansas City    40   41.10   48.40    7.30    43.33   48.33    5.00    59.67   44.67
St. Charles          25   41.60   58.96   17.36    43.12   58.93   15.81    39.67   18.67
Warrenburg           15   52.33   59.66    7.33    51.25   60.62    9.37    24.33   19.00
West Plains          13   37.31   58.77   21.46    45.60   59.16   13.56    40.67   22.33
Lexington            12   47.08   62.50   15.42    45.00   50.00    5.00    24.00   20.00
Raytown              11   40.82   56.36   15.54    41.25   56.25   15.00    33.34   22.67
Cameron               8   49.12   60.12   11.00    53.33   65.00   11.67    27.00   20.00
Windsor               7   40.14   60.00   19.86    42.50   58.75   16.25    24.00   18.66
Lee’s Summit          7   50.28   65.28   15.00    52.50   64.12   11.62    13.33   11.67
Platte City           7   49.00   59.42   10.42    48.75   54.37    5.62    21.67    3.67
Rich Hill             6   45.00   62.00   17.00    50.00   62.50   12.50    [illegible]  [illegible]

¹ The modified range is the difference between the average of the three
highest and the three lowest scores for each group.
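
The modified range defined in the footnote can be sketched in a few lines of Python. This is only an illustration of the definition as stated; the function name and the sample scores are hypothetical and are not data from the study.

```python
def modified_range(scores):
    """Average of the three highest scores minus average of the three lowest."""
    if len(scores) < 3:
        raise ValueError("at least three scores are needed")
    ordered = sorted(scores)
    low_average = sum(ordered[:3]) / 3
    high_average = sum(ordered[-3:]) / 3
    return high_average - low_average


# Hypothetical scores for one small group (illustrative only).
example_scores = [22, 35, 41, 44, 50, 58, 63]
print(round(modified_range(example_scores), 2))
```

Because each average is taken over three scores, the result always falls on a multiple of one-third, which is consistent with the .33 and .67 endings seen in the table.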







The modified range, the difference between the average of the 
three highest and the three lowest scores, was greater for the 
experimental group than for the control group for all the com- 
munities. Despite the small numbers from some of the school 
communities, these figures indicate that in all of the communi- 
ties, the girls whose mothers did not work seemed better adjusted. 

Whereas almost one-third of the experimental group made more 
than fifty per cent unfavorable responses (score of 40 or less), only 
six per cent of the control group were that low (Table 6). It is 


TABLE 6. FREQUENCY DISTRIBUTION OF SCORES MADE BY GIRLS
OF SAMPLE ON HOME LIFE SCALE, ACCORDING TO EXPERIMENTAL
GROUP AND CONTROL GROUP

Columns: Per Cent and Cumulative Per Cent for the Experimental Group;
Per Cent and Cumulative Per Cent for the Control Group.
[Table entries illegible in the source.]






















significant to note that while the experimental group is consider- 
ably lower than the control group, certain individuals in the 
former appear relatively well-adjusted to their family life, accord- 
ing to their responses to the scale employed in this study, and 
considerably more so than do a large percentage of the control 
group. As one would expect, there is considerable overlapping 
between the two groups. 

As is evident in Table 6 there is a considerable difference 
between the two groups in terms of mean scores and variability. 
An analysis of these differences shows that the control group is 
significantly higher in mean score and that the experimental 
group is significantly more variable (Table 7). 


TABLE 7. MEANS AND STANDARD DEVIATIONS OF RESPONSES
OF GIRLS ON HOME LIFE, EXPERIMENTAL AND CONTROL
GROUPS

                       Experimental   Control    Diff     SE_D         CR¹

Mean                       43.8         56.1     12.3      1.5         8.4
Standard Deviation         14.8         10.4      4.4   [illegible] [illegible]

¹ CR, critical ratio, is the statistic divided by its standard error.
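
As a rough check on the critical ratios, the computation can be sketched as follows. The footnote defines CR only as the statistic divided by its standard error; the two standard-error formulas used here (the usual large-sample formulas for a difference between independent means and between independent standard deviations) are an assumption, not something stated in the article, and the function names are illustrative.

```python
import math


def se_diff_means(sd1, n1, sd2, n2):
    # Assumed formula: sqrt(sd1^2/n1 + sd2^2/n2) for independent groups.
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def se_diff_sds(sd1, n1, sd2, n2):
    # Assumed formula: sqrt(sd1^2/(2*n1) + sd2^2/(2*n2)).
    return math.sqrt(sd1 ** 2 / (2 * n1) + sd2 ** 2 / (2 * n2))


def critical_ratio(difference, standard_error):
    # CR as defined in the footnote: statistic divided by its standard error.
    return difference / standard_error


# Means, standard deviations, and group sizes as given in Table 7.
exp_mean, exp_sd, n_exp = 43.8, 14.8, 151
ctl_mean, ctl_sd, n_ctl = 56.1, 10.4, 151

cr_mean = critical_ratio(ctl_mean - exp_mean,
                         se_diff_means(exp_sd, n_exp, ctl_sd, n_ctl))
cr_sd = critical_ratio(exp_sd - ctl_sd,
                       se_diff_sds(exp_sd, n_exp, ctl_sd, n_ctl))

print(round(cr_mean, 1), round(cr_sd, 1))  # 8.4 4.2
```

Under these assumed formulas the results agree with the critical ratios of 8.4 and 4.2 quoted in the Summary, which suggests, but does not prove, that formulas of this kind were used.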


COMPARISON OF RURAL AND CITY GIRLS 


In order to determine whether or not the type of community 
affected the adjustment of girls to home life, the two groups were 
further divided according to urban and rural. Of the four groups 


TABLE 8. SIGNIFICANCE OF DIFFERENCES OF MEAN AND
STANDARD DEVIATION OF URBAN AND RURAL GIRLS OF
SAMPLE, EXPERIMENTAL AND CONTROL GROUPS

                       Experimental   Control    Diff    SE_D     CR

Urban                    N = 113      N = 113
  Mean                     43.5         54.8     11.3     1.7     6.5
  Standard Deviation       15.1         10.7      4.4     1.2     3.6

Rural                    N = 38       N = 38
  Mean                     44.9         60.2     15.3     2.4     6.3
  Standard Deviation       13.6          8.3      5.4     1.8     2.9







thus considered the mean score for family adjustment of the con- 
trol group from rural communities was the highest, 60.2 (Table 
8). The findings here are substantially the same. In both 
the urban and rural communities, the scores of the control 
group were significantly higher and less variable than those of the 
experimental. 

Further grouping of the rural and urban girls within the control 
and experimental groups showed that there was no significant 
difference between either the mean or the standard deviation of 
the rural and urban girls of the experimental group. There was a
very significant difference between the responses of the urban
and rural girls in the control group in mean score as shown by the
critical ratio of 3.2 in favor of the rural girls, but the standard
deviations showed no significant difference (Table 9).


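TABLE 9. SIGNIFICANCE OF DIFFERENCE OF MEAN AND
STANDARD DEVIATION OF EXPERIMENTAL AND CONTROL
GROUPS ACCORDING TO TYPE OF COMMUNITY

                        Urban      Rural
                       N = 113    N = 38      Diff        SE_D         CR

Experimental Group
  Mean                   43.5       44.9   [illegible] [illegible]     .58
  Standard Deviation     15.1       13.6   [illegible] [illegible] [illegible]

Control Group
  Mean                   54.8       60.2   [illegible] [illegible]     3.2
  Standard Deviation     10.7        8.3   [illegible] [illegible] [illegible]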


COMPARISON OF TWO GROUPS ON RESPONSES TO VARIOUS ITEMS OF 
HOME LIFE SCALE 


Space does not permit a detailed presentation here of a com- 
parison of the responses of the two groups to the individual items 
of the scale. It was found that, when the items were ranked 
according to the number of favorable responses in each group, 
sixty-two of the eighty items received relatively the same ranks 
in both groups. The various items on the scale had the same 
relative importance for the two groups although the favorable 
responses were considerably higher in the control group than in 
the experimental group. 

Acceptable responses were received from fifty per cent or more 










of the control group on sixty-eight of the eighty items, while fifty 
per cent or more of the experimental group gave acceptable 
responses on only forty-eight of the eighty items. 

When the significance of the differences between the percentages
of favorable responses in the two groups was determined for the
various items, it was found that the control group was very significantly
higher (critical ratio greater than 3) on thirty-nine of the eighty
items, and significantly higher (critical ratio between 2 and 3) on
twenty-three. On only two items were the responses of the
experimental group slightly higher than those of the control
group.


The twenty items with the greatest difference between the two 
groups (critical ratio 3.79 or higher) are as follows: 


Is your mother at home when you get home from school? 

Does your father nag and scold? 

Do you think that either of your parents holds grudges 
against you? 

Do you get disgusted with the way your father acts in public? 

Where your affairs are concerned, do you think “what my 
folks don’t know won’t hurt them’’? 

Do you feel that your father likes you? 

Do you seem to get scolded for every little thing? 

Does your mother nag and scold? 

Does your father attend the school programs and other 
school activities in which you take part? 

Would you be more proud of your father if he would change 
some of his ways? 

Do your parents like to have your friends around? 

Do you think your family picks on you? 

Does your mother resent it when you disagree with her? 

Are you told to keep still when you try to argue with your 
father? 

Do other young people seem to have more fun with their 
families than you do with yours? 

Do you let your parents in on your ‘big moments’? 

Is your father too busy to pay any attention to the family? 

Do you ‘talk back’ to your father? 

Do you think ‘Oh what is the use!’ after you have tried to 
explain your conduct to your parents? 




Does your mother like to listen to what you tell her when you 
get home from school? 


DISCUSSION 


It is recognized that many factors are involved in a study of
family adjustment and that, if association is found between
maladjustment of the girls and employment of their mothers, this
association cannot be interpreted to mean that only the employ-
ment of the mother produced the maladjustment of the girl. It is
further recognized here that a mother may have full time to
devote to the home, and yet, through clubs and various volunteer
organizations, particularly during the war years, she may have so
many activities that she has less actual time and energy for her
family than does the woman who is employed.

The findings of the study, however, indicate that the girls whose
mothers are employed are, on the average, more poorly adjusted
to family life than are those whose mothers do not work and that
there is a greater feeling of lack of love, understanding, and inter-
est between many parents and their daughters if the mother
works. The responses also seem to signify that there is a greater
lack of coöperation and appreciation on the part of the girls in the
homes of employed mothers. A tendency toward domination by
the parent and a reticence which might border on deception on the
part of the daughter seem more prevalent in the home where the
mother works outside the home.

The rural girls in homes where the mother was not employed
showed the best adjustment to family life. This is possibly due 
to the fact that there are fewer temptations for the rural girl, that 
she probably has more responsibilities, and that often she spends 
more time in being transported to and from school. The causes 
of this difference in home adjustment, however, are outside the 
realm of this study. 

The most significant difference in responses between the two
groups was on “Is your mother home when you get home from
school?” Because there are fewer working mothers at home when
the girl arrives, the girls are encouraged to loiter on the way home.
Almost fifteen per cent more of these girls indicated that their 
parents disapproved of their friends than did the girls whose 
mothers were not employed. Loitering on the way home from 










school with undesirable companions makes it easy for girls to do 
things which might not be approved by the parents. 

More of the parents in the home where the mother worked did 
not seemingly approve of their daughters’ actions as much as did 
the parents of the girls in the homes where the mother did not 
work. According to the responses, the daughter of the employed 
mother feels this disapproval, real or imaginary, warranted or 
otherwise, because she is more inclined to consider that her 
“parents hold a grudge” against her, that she is “scolded for
every little thing,” that her family “picks on her,” that her
parents do not trust her to behave away from them, that she feels 
rebellious toward the family, and that the family treats her like a 
child. 

It seems reasonable to assume that with unsupervised time 
some of the daughters of the women who work may fall into the 
habit of doing things which they should not do. Wholesome 
family discussion of those things which are not approved by the 
family would bring about a better understanding and harmony, 
but it seems that friendly discussion of family problems is not 
common in the home where the mother is employed. 

The fact that many families of employed mothers have little 
time or inclination for family discussion regarding problems seems 
evident in the response denoting that both the fathers and the 
mothers nag and scold, that the girl is told to keep still when she 
tries to argue with her father or with her mother, that both the 
father and the mother resent it when the girl disagrees with them, 
that the parents do not listen to her side when she disagrees with 
them, that the mother tells her she must do a certain thing 
“because I say so,” and that the girl is not given a reason when she
is forbidden to do something. 

Almost seventy per cent more of the girls of the experimental 
group than of the control group felt that where their affairs were 
concerned, “what my folks don’t know won’t hurt them.” The
responses to this item which seemed to show the trend of poor 
adjustment to family life showed a very significant difference 
between the two groups. This item, which has been designated 
by Stott as the key question, was fifth from the highest in terms 
of differences between the two groups. 

Over seventy per cent more of the daughters of the women who 
were employed thought, “Oh, what is the use!” after they had







tried to explain their conduct to their parents, and a larger num- 
ber of them also felt they did not deserve the punishment they 
received. These responses indicate that those girls feel that their 
parents are unfair. 

The tendency of the daughters of the employed mothers to feel 
a lack of love was shown by the responses of the greater number of 
them, compared with the girls whose mothers did not work, indi-
cating that they felt their fathers did not like them and that
other parents seemed to like their children better than their own 
parents liked them. Lack of understanding on the part of the 
parent may be indicated by the greater number of girls of the 
experimental group compared with the control group who did 
not let their parents in on their ‘big moments’ and who felt that 
their parents did not help them overcome their mistakes. 

A greater number of girls whose mothers worked than those 
whose mothers did not work showed a feeling of lack of parental 
interest by their responses that their mothers did not like to listen 
to what the girls had to tell them when they came home from 
school, and that their fathers were too busy to pay any attention 
to the family. This feeling is substantiated by the fact that fewer
parents in homes where the mothers worked were reported as 
attending school activities in which the girls took part than in the 
homes where the mothers were not employed. 

It is possible that the father in the home where the mother 
works may be having trouble with his adjustments to family life 
also, since responses to twelve items out of the thirty-nine show- 
ing very significant differences between the control and experi- 
mental groups dealt with the adjustment of the girl to the father. 
Furthermore, the fathers of the girls whose mothers worked were 
said to complain and to be ‘poor sports’ by a larger number than 
by those whose mothers did not work. 

There appears to be a tendency for the daughters of employed 
mothers to feel ashamed of their parents. This feeling was indi- 
cated by the number of responses to the questions asking if they 
became disgusted with the way both their father and mother 
acted in public, and if they would be ‘more proud’ of both father 
and mother if they would change some of their ways. Seventeen 
per cent more of the experimental group than of the control 
group indicated that they felt their parents did things that made 
them appear foolish. 







It appears that the home life is not as happy when the mother 
works as when she does not. Unfavorable answers of the girls 
whose mothers worked to the following questions would support 
this inference: 


1) Is your family breakfast a gloomy affair? 

2) Does your family have good times together? 

3) Do your parents like to have your friends around? 

4) Do other young people seem to have more fun with their 
family than you do with yours? 

5) Do you like to spend long winter evenings with your 
family group? 

6) Does every member of your family have ‘a say’ in what 
the family does as a group? 


Sixty-two per cent more of the daughters of employed women 
than those of the other group indicated they had more fun away 
from home than at home. It seems that an improvement in social 
relations at home among members of the family could do much to 
help the girl of the working mother in her adjustment to her 
home and possibly to society in general. 

A greater number of the experimental group than of the control 
group indicated that their homes lacked harmony by the responses 
which showed that there were times when some of the members 
of the family did not speak, that the daughter disagreed with the 
mother and that the daughter ‘talked back’ to the father. 

On items concerning the advice of parents, the girl whose 
mother worked was inclined to disregard the advice of her parents 
and to consider as unsound the advice of her father and of her 
mother. A possible explanation of this is that the girl in the 
absence of her mother has been accustomed to making her own 
decisions. 


SUMMARY 


The purpose of this study was to determine whether or not 
girls whose mothers worked outside the home full time were more
poorly adjusted to family life than were the girls whose mothers 
did not work. 

The study was made by means of a family adjustment scale, 
‘Home Life’ by Leland Stott, and a questionnaire which accom- 
panied the scale and which was designed to secure information 
















about the community, the home of the girl, and the work of the 
mother. 

The sample was composed of three hundred two ninth- and 
tenth-grade ‘homemaking’ girls, half of whose mothers worked, 
called the experimental group, and half of whose mothers did not 
work, called the control group. These girls, all of normal 
families, came from eleven communities in Missouri, an equal 
number of each group from each community. About one-fourth 
of the girls were from rural communities. The girls were fairly 
comparable in age, most of them being fourteen or fifteen years 
old. Most of the mothers of the control group were usually 
home when the girls arrived from school but only 36.4 per cent 
of the mothers of the experimental group were home when the 
girls arrived from school. 

There was a very significant difference between the means 
(CR = 8.4) and the standard deviation (CR = 4.2) of the two 
groups, in favor of the control group, in adjustment to family life. 
In all the individual communities the means of the scores were 
higher for the girls in the control group than for those in the 
experimental group. The rural girls whose mothers did not work 
scored highest in their adjustment to family life. 

The control group was very significantly higher than the experi- 
mental group on almost half the items on the ‘Home Life’ scale. 
On twenty-three additional items there was a significant difference 
between the two groups in favor of the control group. 

On the question, “Where your affairs are concerned, do you
think ‘what my folks don’t know won’t hurt them’?”, which is
considered the key question, thirty per cent more of the experi-
mental than of the control group gave unfavorable responses.

Responses seemed to indicate a greater feeling of lack of love,
understanding, interest, and coöperation between parents and
daughters of the experimental group than between those of the
control group. Responses indicated a tendency toward domina-
tion on the part of the parents of the girls of the experimental
group. There was an indication of more disapproval of the ac-
tions of the daughters in the homes where the mothers worked.

Twelve out of the thirty-nine items showing very significant 
differences between the two groups indicated unfavorable adjust- 
ment of the girls of the experimental group to their fathers. 
There was a tendency for the girls of the experimental group to 







feel ashamed of their parents. Girls of the experimental group 
seemed more inclined to disregard parental advice than did the 
girls of the control group. 


BIBLIOGRAPHY 


1) Beals, Frank L. “Wartime Problems of Children.” 
Hygeia, 22:268-9, 296, 298, 300. April, 1944. 

2) Essig, Mary. Adjustment of Girls in Homemaking Classes to 
Family Life. Master’s Thesis, 1945. Colorado A. and M. 
College. 

3) Keliher, A. V. “Expect This from Children—When 
Mothers Work.” Progressive Education, 20:335-7. November, 
1943. 

4) Mathews, Selma M. “The Effects of Mother’s Out-of-
Home Employment Upon Children’s Attitudes and Ideas.” 
Journal of Applied Psychology, 18:116-136. February, 1934. 

5) Stott, Leland H. “Parent-Adolescent Adjustment, Its
Measure and Significance.” Character and Personality, 10:140-150.
December, 1941.

6) U. S. Children’s Bureau. Understanding Juvenile Delin-
quency. Washington, U. S. Gov’t. Printing Off., 1943. 52p.
(U. S. Children’s Bureau. Bureau Publication, No. 300.)

7) Wright, Helen Russel. Children of Wage-earning Mothers. 
Washington, Gov’t. Printing Off., 1922. 92p. (U.S. Children’s 
Bureau. Bureau Publication, No. 102.) 










METHODS FOR DIRECT READING OF 
STANDARD SCORES ON AN 
ELECTRIC SCORING MACHINE 


ORRIS C. HERFINDAHL 
Sp(C)1/ce, U.S.N.R. 


In some testing programs raw scores are converted to stand- 
ard scores designed to show the deviation of a score from the 
arithmetic mean. When an electric scoring machine is used it 
is possible, with certain types of standard scores, to secure them 
from the machine either by reading them directly or by adding 
an easily addible constant to the machine score instead of using 
a conversion table as is usually done.¹

In general terms, the standard score (SS) is a linear function 
(SS = a + bRS) of the raw score (RS), whatever the type of 
standard score used. Ordinarily a will be positive. The accom- 
panying diagram indicates the relationships. 
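
The step from a standard score formula to the constants a and b can be sketched in Python. The function and parameter names are illustrative only; the figures (a scale centered at 70 with 10 points per standard deviation, a raw score mean of 80.0, and a raw score standard deviation of 20.0) are those of the example in footnote 1.

```python
def linear_coefficients(center, scale, raw_mean, raw_sd):
    """Return (a, b) with SS = a + b*RS, given
    SS = center + scale * (RS - raw_mean) / raw_sd."""
    b = scale / raw_sd            # slope of the standard score line
    a = center - b * raw_mean     # intercept on the standard score axis
    return a, b


a, b = linear_coefficients(center=70, scale=10, raw_mean=80.0, raw_sd=20.0)
print(a, b)  # 30.0 0.5
```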

On an electric scoring machine, the machine score is always
proportionate to the raw score, given the position of the rheo-
stat. That is, the machine score = bRS,² so that a given relative
change in RS will produce the same relative change in the machine
score. For example, suppose a test with eighty questions is
being scored on a percentage basis, i. e., the machine score =
$\frac{RS}{80} \cdot 100$ ($b = \frac{100}{80}$). A raw score of 80 is set up to read 100.
If, on the next paper, the raw score is twenty-five per cent less,
or 60, the machine score will also be twenty-five per cent less,
or 75. Any linear function passing through the origin possesses
this property, that is, its elasticity equals one at every point.³





¹ The values of a and b depend, of course, on the raw score mean and
standard deviation for the test in question and on the standard score formula
that is used. For example, suppose a certain test uses the standard score
formula

$$ SS = 70 + \frac{(RS - \overline{RS})\,10}{\sigma_{RS}} $$

Assume $\overline{RS} = 80.0$ and $\sigma_{RS} = 20.0$. Then,

$$ SS = \left(70 - \tfrac{80}{20} \cdot 10\right) + \tfrac{10}{20}\,RS = 30.0 + .5\,RS $$

That is, a = 30.0 and b, which is the slope of the function SS, equals .5.

² When reading raw scores on the machine, b = 1.

³ Elasticity = the relative change in SS ÷ relative change in RS = $\frac{\Delta SS / SS}{\Delta RS / RS}$.










Reading Scores on an Electric Scoring Machine 235 


But because it is very unlikely that a in the equation SS =
a + bRS will be zero, it is generally impossible to set up the
machine to read standard scores without the methods later dis-
cussed. Suppose $\overline{RS}$ and $\sigma_{RS}$ for a certain test give the result
SS = 20 + ½RS. A paper with a raw score of 50 is set up to
read 45. The next paper with, say, a raw score 20 per cent
higher (60) will read 54, whereas it should read 50, or eleven and
one-ninth per cent higher.
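
The discrepancy just described can be reproduced with a short sketch. The machine is modeled, as an assumption for illustration, simply as a device whose reading is proportional to the raw score once it has been set up on one paper; the function names are hypothetical.

```python
def standard_score(raw_score, a=20.0, b=0.5):
    # SS = a + b*RS, with the values of the example in the text.
    return a + b * raw_score


def machine_reading(raw_score, setup_raw, setup_reading):
    # A reading proportional to the raw score, fixed by one setup paper.
    return setup_reading * raw_score / setup_raw


setup_raw = 50
setup_reading = standard_score(setup_raw)      # 45.0

print(machine_reading(60, setup_raw, setup_reading))  # 54.0
print(standard_score(60))                              # 50.0
```

The machine reading rises by 20 per cent with the raw score, while the true standard score rises by only one-ninth, because the intercept a does not scale with the raw score.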




















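[Figure: standard score plotted against raw score, showing the line SS,
which cuts the vertical axis at a, and the shifted line SS′ through the origin 0.]

Fig. 1.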


The solution of the problem is to shift, in effect, the SS func- 
tion so that it passes through the origin. In Figure 1 this is 
indicated by the shift of SS to SS’. 


METHOD I 


The first method for doing this is to set up the machine so 
that an easily addible constant roughly equal to a will give the 
standard score when added to the machine score. 


$$ SS = a + b\,RS \tag{1} $$
Subtracting a and adding ½,
$$ SS - a + \tfrac{1}{2} = b\,RS + \tfrac{1}{2} \tag{2} $$


The effect of subtracting a is to shift the function downward
until it intersects the origin. That is, SS becomes SS′. The ½







has been added so that instead of reading the machine dial to 
the nearest whole number score, an indicator position equal to a 
given whole number score or less than the next higher score can 
be read as the given score. 

With the zero position of the indicator at ½, then, the machine
is first set up with a test paper to read (SS - a + ½), as in
equation (2).⁴ A given relative change in RS will now produce
the same relative change in (SS - a), thus meeting the condition
inherent in the design of the machine.⁵ But it is very improbable
that a will be either a whole number or an easily addible constant
(C), such as 10, 20, or 30. Every score will be in error by the
difference between C and a. Therefore a correction must be
made.


Adding -(C - a) to equation (2),
$$ SS - C + \tfrac{1}{2} = a + b\,RS - C + \tfrac{1}{2} \tag{3} $$


This correction is made by changing the zero position of the indi-
cator so that a test paper first set up to read (SS - a + ½) now
reads (SS - C + ½), as in equation (3). The platen need not
be released to make the change in the reading. This correction
changes every machine reading by a constant, (C - a). Thus
the arc of the indicator for any given raw score still remains the
same as for equation (2).

The machine is now ready for use. The operator calls off the
machine score (SS - C + ½). The recorder mentally adds C
and writes the standard score on the roster.⁶ If scores are being
recorded on the test paper, the operator can easily add C.
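
Method I can be sketched numerically as follows. The model of the machine (a reading equal to the zero position plus an arc proportional to the raw score) is a simplification assumed for illustration, and the values a = 31.5, b = .5, and C = 30 are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math


def method_one_reading(raw_score, a, b, C):
    arc = b * raw_score              # arc of the indicator, proportional to RS
    zero_position = 0.5 + (a - C)    # zero position after the correction
    return zero_position + arc       # equals SS - C + 1/2, as in equation (3)


def recorded_score(reading, C):
    # The whole-number part of the reading is called off, and C is added.
    return math.floor(reading) + C


a, b, C = 31.5, 0.5, 30              # hypothetical values
reading = method_one_reading(61, a, b, C)
print(reading)                       # 32.5
print(recorded_score(reading, C))    # 62
```

Here the true standard score is 31.5 + 30.5 = 62.0, so the recorded value agrees with it; the added ½ is what allows the whole-number part of the reading to stand for the nearest whole standard score.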


METHOD II 


Standard scores can be read directly from the machine if the 
raw score of every paper is increased so that, in effect, SS (see 
Figure 1) is moved to the right until it intersects the origin. 





⁴ Setup data should be calculated to one digit beyond the decimal. The
machine cannot be set up this accurately, but a sufficiently close approxi-
mation can be made.

⁵ Since the zero position of the indicator is at ½ and each score reads ½
point too high, the arc of the indicator is (SS - a), which is directly pro-
portional to RS.

⁶ C added to equation (3) gives SS + ½ = a + bRS + ½, which is the
desired result. Mistakes in adding C are exceedingly rare.




To equation (2) add bs, giving
$$ SS - a + bs + \tfrac{1}{2} = b\,RS + bs + \tfrac{1}{2} \tag{4} $$


s is the amount by which the raw score of every paper is 
increased. This may be done conveniently by having the testees 
black in every choice in the sample box after the test booklets 
have been collected. Only s positions should be punched out 
of the sample box area on the key even though the testees have 
filled in all sample choices on their answer sheets.⁷

With the zero setting at ½, the machine is set up to read
(SS - a + bs + ½), as in equation (4). The arc of the indi-
cator is now proportional to (RS + s). s should be chosen so
that bs is as close to a as possible. Since s must, however, be a
whole number, bs will equal a only by chance. Any given score
will, therefore, be too large or too small by a constant amount.
This is corrected by adjusting the zero position of the indicator,
as in the previously discussed method, to the value of the left
side of equation (5). This correction does not change the arc
of the indicator.


(a - bs) added to equation (4) gives
$$ SS + \tfrac{1}{2} = a + b\,RS + \tfrac{1}{2} \tag{5} $$


Standard scores can now be read directly from the machine. 
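
A sketch of Method II under the same simplified model of the machine (reading = zero position + arc, an assumption made for illustration) is given below. The function names are hypothetical; the values a = 26.67 and b = ⅔ are borrowed from the hypothetical example later in the article, for which s = 40.

```python
import math


def choose_s(a, b):
    # Whole number of added marks that brings b*s as close to a as possible.
    return round(a / b)


def method_two_reading(raw_score, a, b, s):
    arc = b * (raw_score + s)            # arc proportional to (RS + s)
    zero_position = 0.5 + (a - b * s)    # small zero correction, (a - bs)
    return zero_position + arc           # equals SS + 1/2, as in equation (5)


a, b = 26.67, 2 / 3
s = choose_s(a, b)
reading = method_two_reading(100, a, b, s)
print(s)                     # 40
print(math.floor(reading))   # 93
```

The whole-number part of the reading is the standard score itself, so nothing remains for the recorder to add.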


COMBINATION OF METHODS I AND II 


The first method, rather than the second, will have to be used 
in those cases where the sample box contains fewer than s spaces. 
If the first method must be used, it is possible that any easily 
addible constant will be so far from the value of a that an unduly 
large adjustment of the zero position of the indicator will have 
to be made. In this case, an addition to the raw score of each 
paper, while not large enough to enable direct reading of the 
standard score, may easily reduce the zero position correction 
to a small amount. The two methods are simply combined. 





⁷ It has been found that even testees of rather low attainments can follow
instructions for filling in all choices in the sample box. No confusion as to
instructions for the test proper carries over to the next test even if given
immediately.
















The machine is first set up with a test paper so that it indicates 
an amount equal to the left side of equation (6). 


$$ SS - a + bs + \tfrac{1}{2} = b\,RS + bs + \tfrac{1}{2} \tag{6} $$


The score to which the indicator is now turned by the zero posi- 
tion adjustment is obtained by adding to equation (6) 


-(C - a + bs), giving
$$ SS - C + \tfrac{1}{2} = a + b\,RS - C + \tfrac{1}{2} \tag{7} $$


Addition of C by the recorder will give the correct standard score. 


HYPOTHETICAL EXAMPLE OF THE COMBINATION OF BOTH METHODS 


Development of the data necessary to read standard scores by 
adding a constant is shown in the example below. Assume the 
following data: 


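8 choices in the sample box
$\overline{RS} = 50.0$
$\sigma_{RS} = 15.0$
C = 20


The standard score formula used is
$$ SS = 60 + \frac{(RS - \overline{RS})\,10}{\sigma_{RS}} $$

Then,
$$ SS = \left(60 - \frac{50.00}{15.00} \cdot 10\right) + \frac{10}{15}\,RS = 26.67 + \tfrac{2}{3}\,RS $$


That is, a = 26.67, and b = ⅔.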
Suppose the machine is first set up with a prepared test paper, 
the raw score of which is 100. 


Then SS = 26.67 + 66.67 = 93.34 


Parenthetically, note that the raw score of every paper would
have to be increased by 40 if SS were to be read directly without
adding a constant.⁸ Since there are only eight choices in the
sample box, we shall add 20 to the machine score, also adding
eight marks to every paper in order to reduce the zero correction
from 6.67 to a manageable size.





8 s is chosen so that bs = a. In this case, ⅔s = 26⅔, or s = 40.





The first score to be set up is given by equation (6) above:

$$\mathrm{SS} - a + bs + \tfrac{1}{2} = 93.34 - 26.67 + 5.33 + 0.5 = 72.50$$

The correction to be added to equation (6) is

$$-(C - a + bs) = -(20 - 26.67 + 5.33) = +1.34$$


The indicator is then turned to 72.50 + 1.34 (= 73.84) by 
means of the zero position adjusting screw. C (= 20) added to 
73.84 = 93.84, which is .5 point higher than the standard score, 
as it should be. Data should be calculated for a few more test 
papers so that the accuracy of the machine over the relevant 
range can be checked. 

For the use of the operator, setup data might be arranged as 
in the following table: 


(1)        (2)    (3)    (4)            (5)                 (6)            (7)
Test       RS     SS     First score    Turn zero           Constant       Points
                         to be set up   adjustment screw    to be added    blacked in
                                        until indicator     to machine     in sample
                                        reads:              score          box

Spelling   100    93.3   72.5           73.8                20             8
            60    66.7   45.8           47.2                20             8


Note that the readings in col. (5) are .5 point higher after adding 
the constant of 20 than the actual standard score in col. (3) 
because scores are not read to the nearest whole number. 73.8 
would be read as 73 and would be recorded as 93. 
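
The arithmetic of the example can be checked with a short script. The sketch below is an illustration only and is not part of the original procedure; the function names `linear_constants` and `setup_row` are assumptions introduced here. It carries a at full precision instead of rounding it to 26.67, so its figures may differ from those in the text by about 0.01.

```python
# Sketch of the set-up arithmetic in the hypothetical example.
# All names are illustrative; only the formulas come from the text.

def linear_constants(mean_rs, sd_rs, ss_mean=60.0, ss_sd=10.0):
    """Return (a, b) such that SS = a + b * RS."""
    b = ss_sd / sd_rs
    a = ss_mean - b * mean_rs
    return a, b


def setup_row(rs, a, b, s, c):
    """Set-up data for a prepared paper with raw score rs.

    s is the number of marks blacked in in the sample box and
    c is the constant the recorder adds to each machine reading.
    """
    ss = a + b * rs                       # standard score of the prepared paper
    first_score = ss - a + b * s + 0.5    # left side of equation (6)
    correction = -(c - a + b * s)         # amount supplied by the zero adjustment
    indicator = first_score + correction  # left side of equation (7)
    return {
        "RS": rs,
        "SS": ss,
        "first_score": first_score,
        "indicator": indicator,
        "recorded": indicator + c,        # reading after the recorder adds c
    }


a, b = linear_constants(mean_rs=50.0, sd_rs=15.0)
rows = [setup_row(rs, a, b, s=8, c=20) for rs in (100, 60)]
for row in rows:
    print({key: round(value, 2) for key, value in row.items()})
```

Running the same function over several raw scores gives the additional check papers recommended above for testing the machine over the relevant range.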


ADDITIONAL USES 


If two tests are given on the same answer sheet, both can be 
set up by throwing the master control switch from the A to the 
B fields, thus eliminating the necessity of running the papers 
through the machine twice. It will be necessary, however, to 
calculate (C − a + bs) for several values of s for each test.⁹
Then a pair of s’s for the tests must be selected which will reduce
(C − a + bs) to a manageable indicator correction and which
will make this quantity very nearly the same for both tests. 
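
Because s must be a whole number, the pair of values is most easily found by trial. The following sketch shows one way such a search might be organized. It is offered under stated assumptions: each test is taken to have its own constants C, a, and b; the figures for the second test are hypothetical; and the tolerance `limit` for a "manageable" correction is an arbitrary choice, not a value given in the text.

```python
from itertools import product


def correction(c, a, b, s):
    """The quantity (C - a + b*s) for one test."""
    return c - a + b * s


def best_pair(test_a, test_b, max_s, limit=2.0):
    """Search whole-number (s_a, s_b) pairs for two tests scored in one pass.

    Each test is a tuple (c, a, b). Pairs are kept only if both corrections
    lie within +/- limit; among those, the pair whose corrections are most
    nearly equal is returned as (s_a, s_b, corr_a, corr_b), or None if no
    pair qualifies.
    """
    candidates = []
    for s_a, s_b in product(range(max_s + 1), repeat=2):
        k_a = correction(*test_a, s_a)
        k_b = correction(*test_b, s_b)
        if abs(k_a) <= limit and abs(k_b) <= limit:
            candidates.append(
                (abs(k_a - k_b), abs(k_a) + abs(k_b), s_a, s_b, k_a, k_b)
            )
    if not candidates:
        return None
    _, _, s_a, s_b, k_a, k_b = min(candidates)
    return s_a, s_b, k_a, k_b


spelling = (20, 26.67, 2 / 3)   # C, a, b from the example in the text
second_test = (30, 35.0, 0.5)   # hypothetical values, for illustration only
result = best_pair(spelling, second_test, max_s=8)
print(result)
```

Here `max_s` stands for the number of spaces in the sample box, which limits the marks that can be added to each paper.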





9 Since s must be a whole number, its solution cannot be found
algebraically.

















The above development may easily be adapted to a raw scoring 
formula of rights minus some fraction of the wrongs. 


COMPARISON WITH USUAL METHODS


With the procedure commonly used when scoring a test on an 
electric scoring machine and converting to standard scores, the 
operator of the machine calls off the raw score. The recorder 
looks at a conversion table to get the standard score and records 
it on a roster. In some cases the operator works without a 
recorder and locates the standard score on a conversion table, 
writing the score on the test paper itself or on a roster. Sizeable 
errors are of fairly frequent occurrence because of mistakes made 
in finding the standard score on the conversion table. Severe 
eye fatigue usually results since the use of a conversion table 
necessitates continual refocussing of the eyes. Some tests have 
a conversion table printed on the answer sheet with the standard 
scores placed directly below the corresponding raw scores. The 
operator reads the raw score on the machine and then circles the 
corresponding standard score on the answer sheet.¹⁰ This procedure
is open to objection because the search for the proper raw
score, necessarily printed in small type, produces eye fatigue. 
The ease with which eye fatigue develops can perhaps not be 
appreciated unless one has actually operated the machine for a 
period of several hours. 

The principal advantage to be gained from reading standard 
scores directly or by adding a constant is the reduction of oper- 
ator or recorder eye fatigue, with a consequent improvement in 
job attitudes. Whether a recorder writes scores in a grade book 
or the operator writes the score on the answer sheet, machine 
time is slightly less because there is no delay in locating the 
standard score on a conversion table. Once the setup data calcu- 
lated from the above equations have been put into an easily 
used table, the time necessary to set up the machine is little, if 
any, greater than that ordinarily required. It is obvious that 
the setup must be more carefully made. 





10 In any case, the machine can be operated more than twice as fast if the
operator, instead of writing scores on the answer sheet, is assisted by a
recorder who writes the scores on a muster roll or in a grade book. Thus,
labor time used in operating the machine is no greater and it is unnecessary
to later record scores in a grade book.





It is clear that the state of accuracy of the machine would have 
a greater effect on the proposed methods than on the usual one. 
Careful attention to the condition of the machine is necessary. 
Experience indicates that large errors are relatively fewer with
direct reading methods, while small errors are relatively more
numerous than with the usual method. The proportion of papers
in error is slightly greater with direct methods.

The methods discussed here have given satisfactory results for 
a period of several months in a situation requiring the grading 
of eight thousand and more test papers a week. 



THE USEFULNESS OF CORRECTLY SPELLED 
WORDS IN A SPELLING TEST 


ALEXANDER G. WESMAN 
The Psychological Corporation 


One of the difficulties met in the construction of spelling tests of 
the recognition type is the difference in the validity of the item in 
its correct and incorrect forms. For example, in the spelling 
section of the Bennett Stenographic Aptitude Test, words which 
are correctly spelled as presented are not included in the scoring 
because these words were found ineffective. Only those words 
which are presented wrongly spelled contribute to the testee’s 
score. This procedure results in decreased efficiency of the test, 
since only half the items presented (fifty of one hundred) are 
actually used. The remaining items represent merely a necessary 
‘padding’ for the misspelled words. 

It is desirable for a test to have maximum efficiency. This can 
be accomplished by maximizing the reliability of every item. For 
the type of test under consideration it would involve: 


(a) Maximizing the reliability of the misspelled words; 
(b) Maximizing the reliability of the correctly spelled words. 


Since step ‘a’ is ordinarily observed in test construction, it is 
step ‘b’ which is more promising for improvement of present 
instruments of this nature. Maximizing the reliability of the 
correctly spelled words means the discovery, by experimentation, 
of appropriate words which will be effective test items when 
presented in their correctly spelled form. The technique 
described herein is one method of such experimentation. 

The technique used to obtain the desired result—a fully efficient 
spelling test of the true-false type—is relatively simple; however, 
since the author has found no reference in the literature to its 
previous use, it is reported herewith. 

The author wished to construct two equivalent one hundred- 
item forms of a spelling test which would be efficient tests. He 
judged that he would need to start with a pool of three hundred
items in order to have at least two hundred useful items for the 
two final forms of the test. A group of three hundred commonly 
misspelled words was selected from Gates’ A List of Spelling
Difficulties in 3876 Words, and the most frequent error form of 
each word was noted. Words were chosen from the top grade 
placement of Gates’ list (grade 8.0 and up) because of the intended 
level of the tests to be constructed. The three hundred words 
selected were then used to make two tests, each of three hundred 
items. Each test had half its items correctly spelled, and the 
other half of its items incorrectly spelled according to the most 
frequent error form noted in Gates. The tests were in a sense 
‘mirrors’ of each other, since a word spelled properly in one 
form was improperly spelled in the other form of the test. 
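
The construction of the two ‘mirror’ forms may be sketched as follows. This is an illustration under assumptions: the article does not say how words were assigned to the correctly spelled half of each form, so a random split is assumed here, and the four sample words are illustrative and are not drawn from Gates’ list.

```python
import random


def build_mirror_forms(pairs, seed=0):
    """Build two forms from (correct, misspelled) word pairs.

    Half the words (chosen at random, an assumption) appear correctly
    spelled in form A and misspelled in form B; the rest are reversed.
    Each item is a tuple (spelling shown, True if that spelling is correct).
    """
    rng = random.Random(seed)
    order = list(range(len(pairs)))
    rng.shuffle(order)
    correct_in_a = set(order[: len(pairs) // 2])

    form_a, form_b = [], []
    for index, (right, wrong) in enumerate(pairs):
        if index in correct_in_a:
            form_a.append((right, True))
            form_b.append((wrong, False))
        else:
            form_a.append((wrong, False))
            form_b.append((right, True))
    return form_a, form_b


# Illustrative word pairs only.
pairs = [
    ("separate", "seperate"),
    ("necessary", "neccessary"),
    ("receive", "recieve"),
    ("occurrence", "occurence"),
]
form_a, form_b = build_mirror_forms(pairs)
for item_a, item_b in zip(form_a, form_b):
    print(item_a, item_b)
```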

The tests were administered to the entire ninth grade of a 
junior high school (ninety-three boys and one hundred nine 
girls), half the students taking each form of the test. The two 
forms were similarly administered to the entire eleventh and 
twelfth grades of a senior high school containing eighty-one boys 
and one hundred seventeen girls. No attempt was made to 
analyze the data separately by sex. 

Coefficients of correlation of each item against the total score on 
the test were obtained for each word in its correctly and incor- 
rectly spelled form. These coefficients were determined sepa- 
rately for the junior and senior high school students. The 
highest-scoring twenty-seven per cent and lowest-scoring twenty- 
seven per cent on total score in each school group were selected 
as criterion groups. The per cent of the upper group which 
passed each item was compared with the per cent of the lower 
group passing that item. These values were then referred to 
Flanagan’s chart and the appropriate coefficient was found 
therein. Thus, four coefficients were obtained for each of the 
three hundred words:* (1) when correctly spelled with ninth- 
grade students; (2) when correctly spelled with eleventh- and 
twelfth-grade students; (3) when misspelled with ninth-grade 
students; and (4) when misspelled with eleventh- and twelfth- 
grade students. 
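
The selection of criterion groups and the two percentages that are carried to Flanagan’s chart can be sketched as below. The function name and the ten pupils’ data are hypothetical; the chart lookup itself is not reproduced, since the coefficient is read from the published table and not computed here.

```python
def upper_lower_percent_passing(total_scores, item_correct, fraction=0.27):
    """Per cent passing one item in the upper and lower criterion groups.

    total_scores[i] is pupil i's total test score and item_correct[i]
    is 1 if that pupil answered the item correctly, else 0.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each criterion group
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    upper, lower = ranked[:k], ranked[-k:]

    def percent(group):
        return 100.0 * sum(item_correct[i] for i in group) / len(group)

    return percent(upper), percent(lower)


# Hypothetical data for ten pupils, for illustration only.
total_scores = [95, 90, 88, 80, 75, 70, 62, 55, 50, 41]
item_correct = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]
upper_pct, lower_pct = upper_lower_percent_passing(total_scores, item_correct)
print(round(upper_pct, 1), round(lower_pct, 1))
```

The pair of percentages is what is referred to the chart; a large difference in favor of the upper group corresponds to a high item-test coefficient.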

Table I presents distributions of correlation coefficients under 
the four conditions, together with the median and the quartile 
deviation of each distribution. Inspection of the table reveals 
that higher coefficients are obtained, on the average, with the 
incorrectly spelled forms of the words. The median coefficients 





* Because of typing errors, two words were incorrectly spelled in both 
forms, and one word was correctly spelled on both forms. 





TABLE I. DISTRIBUTIONS OF CORRELATION COEFFICIENTS OF
MISSPELLED AND CORRECTLY SPELLED FORMS OF WORDS FOR
NINTH-GRADE AND ELEVENTH- AND TWELFTH-GRADE PUPILS

                   Misspelled                Correctly Spelled
   r         9th Grade   11th & 12th     9th Grade   11th & 12th
              N = 202      Grades         N = 202      Grades
                           N = 198                     N = 198
.84-.89          1            2              0            0
.78-.83         13            8              1            0
.72-.77         16           16              8            3
.66-.71         27           29             14           11
.60-.65         39           35             14           13
.54-.59         32           34             20            7
.48-.53         37           46             29           33
.42-.47         20           32             43           32
.36-.41         27           22             30           36
.30-.35         25           24             34           29
.24-.29         22           16             26           41
.18-.23         11           12             20           17
.12-.17         10            7             23           14
.06-.11          8            8              6           14
.00-.05          3            5             21           34
Negative coefficients
(-.01 to -.39)








for misspelled forms of the words are .50 and .51 for the junior and 
senior high school groups, respectively, as compared with medians 
of .38 and .34 for the correctly spelled forms. The variability of 
the coefficients is relatively constant for all four conditions, 
the respective quartile deviations being .15, .14, .14, and .14. 

The data in Table I are in accord with previous experience that 
misspelled words are generally more discriminating. This is 
further confirmed by Table II, which shows the number of words 
which have higher coefficients for both school groups when mis- 
spelled, the number which are higher for a single group, and the 
number which are equal or lower in both groups.* Slightly more 
than half the words used have higher coefficients for the mis- 
spelled form at both grade levels, as against only thirty-one which 
are lower at both levels. This superiority of the misspelled words 
is emphasized when the individual grade levels are considered. 


TABLE II. A COMPARISON OF THE ITEM-TEST CORRELATION OF
297 WORDS AS PRESENTED IN MISSPELLED AND CORRECTLY
SPELLED FORM


                                     9th     11th and 12th    Both
                                    Grade       Grades       Grades


r higher when misspelled........     196          218          153
r equal.........................       4            5            0
r higher when correctly spelled.      97           74           31

These data throw light on the relative usefulness of misspelled 
and correctly spelled words, to the advantage of the misspelled. 
From the test construction standpoint, however, it is equally 
important to note that there are a large number of words which 
are useful when correctly spelled. As shown in Table I, one 
hundred ninety-four of the correctly spelled words have item-test 
score coefficients of .30 or higher at the ninth-grade level, and 
one hundred sixty-five of these words have coefficients of that 
magnitude at the eleventh- and twelfth-grade level. Their 
use in a true-false spelling test would represent a real contribution 
to the test’s efficiency rather than acting as merely ‘padding.’ 

It is not here proposed that the best way to measure spelling 
ability is by means of the true-false recognition type of item. The 





* Three words are excluded from Table II: the two which were misspelled, 
and the one which was correctly spelled in both forms. 














answer to that question depends on the aspect of spelling ability 
to be measured, or the use for which spelling ability is intended. 
There are purposes (e.g., proofreading) for which the recognition 
type of item is best suited. 

It is probable that the use of this ‘mirror’ technique for such 
tests will enable other test constructors to find words which will 
be useful for their own specific applications. 

A useful study, or series of studies, might well be undertaken to 
determine the most desirable proportions of misspelled and cor- 
rectly spelled words to include in a test of this sort. For obvious 
practical reasons, a test consisting entirely of misspelled words 
would not be feasible. The best proportion, however, should be 
subject to experimental determination. 


REFERENCES 


1) Bennett, George K. Stenographic Aptitude Test, Manual 
of Directions. New York: The Psychological Corporation, 1945. 

2) Flanagan, John C. A Table of the Values of the Product 
Moment Coefficient of Correlation in a Normal Bivariate Population 
Corresponding to Given Proportions of Successes. New York: 
Coöperative Test Service, 1936.

3) Gates, Arthur I. A List of Spelling Difficulties in 3876 
Words. New York: Bureau of Publications, Teachers College, 
1937. Pp. 166. 





RATE OF PROGRESS AS RELATED TO DIFFICULTY 
OF ASSIGNMENT 


WILLIAM C. F. KRUEGER 
Wayne University, Detroit, Michigan 


This experiment was designed to observe the relationship 
between the rate of progress while in a learning situation and the 
varying difficulty of the tasks assigned. More specifically stated 
the problem was: What is the rate of acceleration in a curve of 
learning when the difficulty of the assignment varies? 

The series of experiments required the learning of lists of 
nonsense syllables. The lists varied in length and in difficulty. 
The learning lists consisted of series of five, fifteen, fifty or one 
hundred nonsense syllables. In a series of ‘easy’ syllables all 
items had a meaning value ranging from seventy-five per cent to 
one hundred per cent as organized by the author.¹ The ‘difficult’
series contained items selected from the opposite of that classifica- 
tion. The task was to memorize a prescribed number of nonsense 
syllables so that these items could be recognized correctly when 
mixed with an equal number of additional items of the same type. 

The procedure for learning and testing was as follows. The 
experimenter spelled each syllable of the learning list. Two 
seconds were allotted for each item which permitted a brief pause 
between any two successive syllables. After the learning list 
had been spelled the subject was given a recognition test. He was 
presented with a printed list of syllables in which one-half were 
the items contained in the learning series while the other half 
consisted of additional syllables of the same level of difficulty. 
All syllables were intermingled in random order. To make sure
that the subjects attended to every item of the recognition list, the
experimenter spelled aloud each syllable, allowing two seconds per
item. During the interval the subject had to make his judgment
for that particular syllable. No one was allowed to work ahead
or to change his judgment after the time interval had passed.
Next followed the second spelling of the learning list, which was
done in the same manner as before. Then followed the second test.
Thus learning and test were alternated until the desired number of 
twelve tests had been completed. 

1 Krueger, Wm. C. F., “The Relative Difficulty of Nonsense Syllables,”
Journal of Experimental Psychology, 1934, vol. 17, pp. 145-153.
Two preliminary practice sessions preceded the regular experiment
to acquaint the learner with the procedure and to make
sure that each subject knew what was required of him. A carefully
counter-balanced order was followed to compensate for
the additional practice effects and the usual factors present in this
type of experimentation. Forty undergraduate students were
used for each of the experimental conditions of this study. The
groups were equated in so far as this was feasible.

Fig. 1. Curves of Learning Derived from Tasks of Varying Difficulty.
(Horizontal axis: successive tests.)

Key: (A) Five easy items. (B) Fifteen easy items. (C) Fifty easy items.
(D) One hundred easy items. (E) Five difficult items. (F) Fifteen difficult
items. (G) Fifty difficult items. (H) One hundred difficult items.

In order to measure the extent of learning attained, the learner
marked after each syllable during the test readings whether the
syllable just spelled was one of the ‘old’ (meaning contained in the
learning list) or one of the ‘new’ (those added to the original learning
list) items. His learning score was then determined by the number
of ‘old’ items marked correctly as ‘old’. The averages for
the respective groups, each based upon forty scores with a few 
exceptions in which only thirty-nine subjects were available, were
tabulated.¹ By transmuting the absolute scores into percentages
of the total assignment, we obtained another way of stating the
same information. These relative measures were used to construct
Figure 1.

The learning of the lists of five easy items (Curve A) yielded a
negatively accelerated curve. The five difficult items (Curve E)
also showed the trend of negative acceleration. The task of
learning fifteen easy syllables (Curve B) indicated the same trend.
However, the fifteen difficult items (Curve F) produced an almost
straight-line curve with a possibly slight trend toward positive
acceleration. By comparing Curves C and G we again found the same
trend as with Curves B and F, respectively. The two sets of 100-item
lists (Curves D and H) both indicate positive acceleration, for the
easy and the difficult assignment alike.
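
The terms negative and positive acceleration can be made concrete by looking at successive gains from test to test. The sketch below is one possible way of doing so and is not the author’s procedure: it converts scores to percentages of the assignment and classifies a curve by the mean of its second differences. The two score series and the tolerance `tol` are hypothetical, since the detailed table of averages is not printed.

```python
def to_percent(scores, list_length):
    """Express average learning scores as percentages of the assignment."""
    return [100.0 * score / list_length for score in scores]


def mean_second_difference(values):
    """Average change in the test-to-test gain."""
    gains = [later - earlier for earlier, later in zip(values, values[1:])]
    changes = [later - earlier for earlier, later in zip(gains, gains[1:])]
    return sum(changes) / len(changes)


def acceleration(values, tol=0.5):
    """Label a curve by the sign of its mean second difference."""
    m = mean_second_difference(values)
    if m > tol:
        return "positive"
    if m < -tol:
        return "negative"
    return "approximately linear"


# Hypothetical average scores on six successive tests, for illustration only.
easy_five = to_percent([3.0, 4.0, 4.5, 4.8, 4.9, 5.0], list_length=5)
hard_hundred = to_percent([2, 5, 10, 18, 30, 46], list_length=100)

print(acceleration(easy_five))
print(acceleration(hard_hundred))
```

Gains that shrink from test to test give a negative mean second difference, and gains that grow give a positive one.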


CONCLUSIONS 


1) This experiment yielded eight curves of learning, each with 
its own rate of acceleration. 

2) As the tasks increased in difficulty, the corresponding learning
curves gradually changed from a definitely negatively accelerated
rate toward a definitely positive acceleration.

3) The rate of progress made during the learning period ap- 
peared to be a function of the difficulty of the assignment. 

4) If the several sections of this experiment can be regarded as 
partial studies of a larger problem, the results obtained may 
express this trend: When an assignment is relatively easy for the 
learner, the progress to be expected is at a rate of negative 
acceleration. If, however, the task is rather difficult for the 
learner, we may anticipate a positively accelerated progress. As 
the tasks shift from relatively difficult to relatively easy assignments,
we may expect a corresponding change in the rate of
progress made by the learner. 





1 Because of restrictions at present placed on special matter, the detailed 
table of average learning scores for the various conditions is omitted. The 
author will be glad to supply mimeographed copies to those desiring them. 













BOOK REVIEWS 


RENÉ LE SENNE. Traité de Caractérologie. Paris: Presses
Universitaires de Paris, 1945, pp. 648.


The first book to be received for review from postwar France 
is a volume on Characterology by René Le Senne, professor at 
the Sorbonne and author of two earlier volumes on philosophy 
and general morality. 

This traité is not strictly a psychological work, but rather a 
philosophical discussion of common types of character, differ- 
entiated one from another by certain criteria. Character is 
limited by definition, at the outset, to the summation of heredi- 
tary dispositions which form the mental framework of a person. 
Emphasis is laid on three elements in this definition; namely, the 
hereditary or non-acquired basis of character, its permanence 
of structural identity, and the unification which it achieves of 
mind and body. In a methodical analysis of several meanings of 
character the author distinguishes between the narrow sense in 
which he employs the term and a broader, more common meaning 
which has a moral tone. He shows that German writers, such as 
Alfred Adler and others from whom he draws much evidence, 
have extended the meaning of character to include the ways in 
which an individual develops his hereditary dispositions; but 
for him the definition remains: 

“l’ensemble des dispositions congénitales qui forme le squelette
mental d’un homme.”

On the assumption that every person has character in this 
limited sense, Le Senne rests the claim to importance of his 
inquiry. If a determining character can be identified for 
individuals or groups of similarly constituted individuals, pre- 
dictions of reactions can be made which will have both personal 
and social significance. Such general types of character, he 
states, are better than the universals of the idealist and the 
abstract averages of the empiricist, because they offer more 
accurate descriptions of concrete, flesh and blood humanity. 
Only through a philosophic inquiry, however, can the underlying 
and pervading natures of these various individuals be understood. 

In carrying forward his study of character Le Senne places 
great stress on a hierarchy of sciences from physical through
physiological to characterological. At the highest level is the 
study of character, which, in turn, becomes involved in social 
results of behavior. Characterology is also considered to be on 
the same plane as the study of pathological individuals in 
psychiatry, from which supporting evidence is drawn about 
normal character. As a final discipline, however, characterology 
draws from all sciences and encompasses a unity or various 
unities of which the human spirit is capable. 

The method of research employed to validate this position is 
largely intuitive: 

“Au coeur de la caractérologie doit donc toujours se trouver
l’intuition caractérologique.” (p. 34.)

Confirmation of the intuitive premises rests on selected empirical 
data and a subjective, spiritual evaluation of each individual 
who is: 

“une âme impossible à remplacer.” (p. 44.)

Several Twentieth Century writers on characterology, both 
in France and in Germany, are cited for support in a chapter 
devoted to documentation. Of these Gerard Heymans and 
E. Wiersma of the University of Groningen are given credit as 
the principal exponents of a type-classification similar to that 
of Le Senne. Earlier discussions of types are to be found among 
the Greeks, as in Hippocrates and Galen, who rested their 
classification upon the four humors. Had they utilized the 
differentiation of active and inactive, they would have formulated 
the same number of types. 

In arriving at the eight types of character delineated in this 
treatise, Le Senne postulates three essential traits: emotivity, 
activity, and retention of impressions. Individuals are said 
to differ in the degree to which they possess these qualities. 
Emotivity is defined as a more or less strong disturbance in the 
organic and psychological life of an individual, caused by events 
which in themselves may be unimportant. Satisfaction of 
emotivity through appropriate action is said to result in senti- 
ment, while a failure to find an outlet for this energy produces 
emotion. The difference between the objective importance of 
an event and the subjective disturbance is indicative of the 
degree of emotivity. Activity, as used in this characterological 
study, differs from its more common usage in that apparent or 
manifest activity is not meant. Instead, individuals are judged 








gost 


252 The Journal of Educational Psychology 


to be active to the degree that they are spurred on by an obstacle 
placed in the direction of their action, and inactive when dis- 
couraged by such an obstacle. Some individuals become 
conscious of events at once and are said to be primary, while 
others depend largely upon reflection for their experience and 
are designated as secondary. Certain supplementary traits, 
such as the size of the conscious field, analytical intelligence, 
egocentrism, and predominant tendencies, are also postulated 
as bases for character types. 

Combining the essential traits in every possible manner, 
Le Senne enumerates the following eight character-types, each 
term being limited to his own definition of it: 


Combination of Traits            Qualities          Examples

Emotive-inactive-primary         nervous            Byron, Chopin, Poe, Sterne
Emotive-inactive-secondary       sentimental        Amiel, Thackeray
                                 (or melancholic)
Emotive-active-primary           choleric           Danton, Casanova, Dickens, Scott
Emotive-active-secondary         impassioned        Napoleon, Tolstoi, Goethe
Non-emotive-active-primary       sanguine           Bacon, Machiavelli, Voltaire
Non-emotive-active-secondary     phlegmatic         Kant, Bentham, Darwin, Franklin
Non-emotive-inactive-primary     amorphous          Louis XV
Non-emotive-inactive-secondary   apathetic          Louis XVI

In the chapters which form the bulk of the study, each of 
these types is illustrated in two ways—by statistical data from 
a questionnaire study sent to three thousand German and Dutch 
doctors by Heymans and Wiersma and by biographical analyses 
of one hundred ten personages by the same writers. 





At the conclusion of his discussion of these special types of 
character, the author indicates briefly the importance of his 
classification in the various fields of mental health, inter-personal 
relations, education, criminology, psychiatry, and politics. 
The values set forth are general in nature, depending mostly 
on an increase in empathy between the student of characterology 
and persons possessing other character types. 

Several difficulties present themselves in a philosophic study 
of this sort. In the first place, many basic assumptions must 
be made by the author which may or may not be acceptable to 
the reader. In the present instance, a dichotomy between mind 
(or spirit) and body is assumed. Furthermore, as indicated 
above, the hereditary uniqueness of any individual is said to 
persist throughout life in spite of the effects of the environment 
and life-long experiences. This continuity is thought of as being 
in some way structural. Secondly, the method of analysis is 
primarily intuitive, supported by limited biographical and 
statistical studies. Much reliance is placed throughout the 
volume on analogy, “des analogies, non sans valeur” (p. 8),
common sayings, subjective judgments of the author, and 
generalized human experience. What appears obvious to 
Le Senne does not lend itself to critical and objective analysis 
of any kind. A moral tone runs through the work in spite of 
the limitation of the definition of character which is made at the 
beginning. It becomes evident that the author is not concerned 
with knowledge about human behavior for its own sake: 

“La caractérologie ne vaudrait pas une heure de peine si
elle ne permettait pas d’améliorer les actions humaines.” (p. 16.)

Fundamentally an idealist, Le Senne emphasizes the individuality
of each child of whose education he remarks: “son
individualité idéale est le but suprême.” He therefore discovers
throughout human existence a continuity and consistency among 
his eight types of character which other writers have not found. 
Since G. W. Allport, in Chapter III of his book on personality, 
has so ably summarized the other major classifications, a further 
discussion of them is unnecessary. 

The chief value of this traité lies not in the method nor in the 
findings, but rather in the author’s presentation of the limitations 
of a mechanistic analysis of human behavior. Educators will
generally agree that the ‘average’ person described by science 
has no exact counterpart in the world of living beings. They 
will also accept the criticism that the most accurately interpreted 
tests reveal the least about an individual because they measure 
the most specific behavior. Even though educators and psy- 
chologists recognize such limitations in scientific investigations 
of personality, few will be led to accept the eight general types 
of character solely upon the evidence presented by Le Senne. 
What he has contributed by distinguishing underlying, inherited 
patterns of behavior is to re-emphasize the necessity for further 
investigations of temperament and predispositions. He has 
perhaps revived the historical interest in the unique and dis- 
tinctive nature of each individual for modern theorizing and 
extended analysis.
NEWTON R. CALHOUN
University of Chicago


FRED M. FOWLER. Selection of Students for Vocational Training.
Washington: U.S. Office of Education, Vocational Division 
Bulletin No. 232, Occupational Information and Guidance 
Series No. 13, 1945, pp. 156. 


The background of this bulletin is the descriptive report of 
practices received from one hundred sixty schools in thirty-four
states, the District of Columbia and Puerto Rico. The part 
played by the prospective trainee in choosing an occupational 
goal, and consequently the training that leads to it, is conceived 
as the more important part of the process of selection. This 
selection is best accomplished through the guidance program. 
Underlying principles in the program include: The vocational 
program must bring good job adjustment, provide the proper 
number of suitably trained workers to the employer, and serve 
the best interests of the community. The decision, intelligently 
made by the trainee, to enter a particular course is as important 
a part of the selection as the decision of the school to admit him. 
This decision must be made in terms of reliable information 
about job and training opportunities in addition to information 
about the trainee’s own abilities, aptitudes, interests and personal 
adjustment. Counseling service should be available to the 
student during the school years prior to entering vocational 
training. It should continue during the training and also during 
adjustment on the job after training. 

Guidance procedures essential to intelligent selection of train- 
ing include the individual inventory, informational service, 
counseling service, placement service and follow-up service. 

The selected list of references appended reveals acquaintance 
with up-to-date source material. An extensive group of record 
forms useful in the guidance program is also given in the 
appendix. 

The guidance program outlined in this bulletin is well con- 
ceived, carefully organized and practical. Furthermore, it is 
based upon sound psychological principles. The treatise should 
be influential in promoting more adequate selection of students 
for vocational training. Miles A. Tinker 

University of Minnesota 


Milton Harrington. The Management of the Mind. New 
York: Philosophical Library. 1945, 200 pp. 


The present small volume is a posthumous account of the 
author’s theories of mental hygiene and psychotherapy based on 
his theoretical position as formulated in his “The Biological 
Approach to the Problem of Abnormal Behavior.” Dr. Ralph 
B. Winn has edited the volume from a manuscript and notes left 
by Dr. Harrington at his death in 1942. 

The author is outspokenly antagonistic to psychoanalysis, and 
apparently to any dynamic theory of personality. His is a 
behavioristic and physiological theory which relates all behavior 
to flow of nervous energy, and abnormal behavior to interference 
with adequate flow. Comment on the theory can only label it 
as naive and very probably inadequate. 

The position is fairly represented in a quotation. “. . . these 
ills are due to defects of personality resulting from bad heredity, 
physical disease and bad education, and to difficult situations, 
with which, owing to his personality defects, the patient is unable 
to cope. Treatment for us, therefore, is a matter of helping the 
patient to correct his personality defect, by improving his bodily 
health and by re-education, and of helping him to deal with the 
difficult situations of life, with which he is confronted.” 

In spite of the theory the author does have some useful 
advice concerning the handling of simple behavior difficulties. 
This is a readable book and could be recommended to lay per- 
sons, but there is great danger that the whole presentation makes 
the problems of mental health much more simple than they 
actually are. C. M. Louttit 
Ohio State University 











